The Transmission Distortion of a Source as
a Function of the Encoding Block Length*


By R. J. PILC 
(Manuscript received December 15, 1967) 


This paper is concerned with the transmission of a discrete, independent
letter information source over a discrete channel. A distortion function is
defined between source output letters and decoder output letters and is used
to measure the performance of the system for each transmission. The
coding block length is introduced as a variable and its influence upon the
minimum attainable transmission distortion is investigated.

The lower bound to transmission distortion is found to converge to
the distortion level $d_C$ ($C$ is the channel capacity) algebraically as $a/n$.
The nonnegative coefficient $a$ is a function of both the source and channel
statistics, which are interrelated in such a way as to suggest the utility of
this coefficient as a measure of “mismatch” between source and channel:
the larger the mismatch, the slower the approach of the lower bound to the
asymptote $d_C$. For noiseless channels $a = \infty$, and for this case the lower
bound is shown to converge to $d_C$ as $a_1 (\ln n)/n$.

For noisy channels the upper bound to transmission distortion is found
to converge to the asymptote $d_C$ algebraically as $b[(\ln n)/n]^{1/2}$. For noiseless
channels, the upper bound converges to $d_C$ as $a_1 (\ln n)/n$.


*The material presented in this paper is based upon the author’s thesis, 
“Coding Theorems for Discrete Source-Channel Pairs,” presented to the Massa- 
chusetts Institute of Technology in November 1966 in partial fulfillment of the 
requirements for the degree of Doctor of Philosophy.


I. INTRODUCTION


By now the results originally obtained by Shannon¹ relating reliability
and channel capacity are well known. Roughly speaking, they
state that perfect transmission can be achieved if, and only if, the 
capacity of the channel in the transmission link is greater than the 
information content of the source. For amplitude and time discrete 
sources the information content is the entropy of the source, but for 
amplitude continuous sources the entropy and the information con- 
tent are not the same since the information content is infinite. This, 
of course, implies that perfect transmission of amplitude continuous 
sources, or discrete sources with an entropy that is “too large,” is 
impossible with a given finite capacity channel. Yet this is just the 
situation that is often presented to the communication engineer who 
must then try to reduce the average distortion to the lowest possible, 
or practicable, level. 

For communication systems in which the capacity of the channel 
is not sufficient to allow perfect transmission, there are two obvious 
questions to ask: 


(i) How small can the average distortion be made if any transmission
strategy at all is allowed?

(ii) How much does the system complexity, or cost, increase when
you are required to get “closer” to this minimum?


To answer the first question, Shannon generalized his results in a
later paper² in which the channel requirements are found that are
necessary and sufficient to allow transmission at a given level of 
distortion, or a given error rate. It is our purpose here to consider 
the second question. We use the coding block length to measure the 
complexity of the system, and study the behavior of the minimum 
attainable transmission distortion as the block length is increased. 

In this work we restrict our attention to sources and channels that
are discrete in amplitude and time, and that are constant and memory- 
less. This means that successive events are independent and are 
governed by the same probability distributions. The encoder is a 
block encoder that we describe later in this section. To measure the 
distortion in the system, we introduce a nonnegative function d(w,z) 
which gives the distortion in the event letter z is presented to the 
user at the decoder output when letter w was transmitted. Normally, 
this function would be specified by the user of the system to reflect 
how undesirable any particular misinterpretation of the source output
is to him. We will assume that the distortion between two sequences
of letters is the averaged sum of the composing letter distortions. 

Shannon’s theory associates with each source and distortion function 
a rate-distortion curve which expresses the minimum attainable transmission
distortion in terms of the maximum allowable mutual information
in the system. Associated with each point $(d_R, R)$ on the
rate-distortion curve is a particular set of transition probabilities,
called the “test channel,” which has the significance that among all
channels that transmit the given source with distortion $d_R$ or less, it
operates at the lowest transmission rate, $R$. Equivalently, the test
channel is that channel which yields the lowest distortion $d_R$ among
those that transmit information from the source at a rate $R$ or less.
It is in this sense the cheapest channel one could use and meet a
distortion criterion. The rate $R$ can also be interpreted as the equivalent
information content of the source when a distortion $d_R$ is
tolerable.

That the rate-distortion curve gives the channel capacity sufficient 
to allow a prescribed performance is shown by Shannon through the 
intermediate step of proving that the rate-distortion curve actually 
expresses the entropy and resultant distortion in the “best” discrete 
representation of an output sequence from the original source. This 
discrete representation can then be transmitted with no further dis- 
tortion, if its entropy is less than the channel capacity, by the use 
of suitable channel coding techniques. 

Shannon has found the rate-distortion curves for many discrete 
sources and an explicit expression for this curve for time discrete 
gaussian sources. These results, together with Shannon’s work with 
vector sources, were used to get rate-distortion curves for gaussian 
random processes.³,⁴ Bounds to the rate-distortion curve for non-gaussian
sources have also been obtained.⁵,⁶

However, all of the rate-distortion results derived for both con- 
tinuous and discrete sources are limiting results, that is, they can 
be approached in general only when arbitrarily complex operations 
on very long sequences of source output are allowed before transmit- 
ting the “message” through a correspondingly large use of the channel. 
T. Goblick was the first to study the rate of approach to these limit- 
ing results as the source output block length increases, but limited 
his work to source representation or source encoding, with a deterministic
map between the source and its representation.⁷ Our work
includes a noisy channel, or probabilistic function, between the
source and user.

A performance curve $d(n)$ will be introduced for each source-channel
pair as the minimum possible average distortion obtainable
using a modulator that encodes a string of $n$ successive source outputs
into an input signal acceptable by a channel composed of $n$ uses of
the original channel. For a source with the rate-distortion curve of 
Fig. 1 and a channel with capacity C, the performance curve might 
look like the one shown in Fig. 2. 

From Shannon’s theory it is known that the performance curve
starts at $d_0$, the zero-rate distortion, and decreases to asymptotically
approach $d_C$, the distortion corresponding to the information rate $C$
on the rate-distortion curve. The curve, of course, has meaning only
for integral values of n. Not all modulators and decoders provide a
distortion curve that approaches $d_C$ for large n, but any such curve
obviously must lie on or above the performance curve, which alternatively
could have been defined as the lower envelope to the set of distortion
curves corresponding to all encoder-decoder pairs.


II. THE LOWER BOUND


Upper and lower bounds to the performance curve have been
derived.⁸ We present the lower bound in the first part of this paper,
and the upper bound in Sections XI through XVII. Most of our
effort concerning the lower bound was directed toward finding information
about the rate of approach of the performance curve to its
asymptote. In particular, we tried to relate the source and channel
statistics, as well as the method of encoding that is used, to the rate
of approach of $d(n)$ to $d_C$.


Fig. 1 — The rate-distortion curve for $\mathcal{S}$. (Labels recovered from the figure: $H$, $d_C$, $d_0$.)

Fig. 2 — The performance curve for $\mathcal{S}$ and $\mathcal{C}$.


Concerning this rate of approach, several interesting situations 
are known to exist. For one, there are some source-channel pairs for 
which the minimum attainable transmission distortion is independent 
of the encoding block length, with the consequence that it is possible
to attain the distortion level $d_C$ with a coding block length of one.
One example of such a pair is a binary symmetric source (equally
likely binary letters with $d(i,j) = 1 - \delta_{ij}$; $i, j = 1, 2$) used with a
binary symmetric channel, where the optimum encoder is a direct
connection. Another example is a gaussian source used with an additive
gaussian noise channel, where the optimum encoder is simply
an amplifier.⁹
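
The binary example can be checked numerically. The sketch below assumes the standard closed form R(d) = 1 - H_b(d) (in bits) for an equiprobable binary source with the distortion measure above, and C = 1 - H_b(eps) for a binary symmetric channel with crossover probability eps. Neither formula is stated in the text, and all function names are illustrative. The code inverts R(d) = C by bisection and compares the result with the distortion eps produced by the direct connection.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy H_b(p) in bits, taken as 0 at the endpoints."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bsc_capacity(eps: float) -> float:
    """Capacity of a binary symmetric channel (assumed standard formula)."""
    return 1.0 - binary_entropy(eps)


def distortion_at_rate(rate: float, tol: float = 1e-12) -> float:
    """Invert R(d) = 1 - H_b(d) on 0 <= d <= 1/2 by bisection."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if 1.0 - binary_entropy(mid) > rate:
            lo = mid  # R(mid) still exceeds the rate: move to larger distortion
        else:
            hi = mid
    return 0.5 * (lo + hi)


eps = 0.1  # hypothetical crossover probability
d_C = distortion_at_rate(bsc_capacity(eps))
direct_connection_distortion = eps  # a letter is wrong exactly when the channel flips it
print(d_C, direct_connection_distortion)
```

Under these assumptions the two values agree to within the bisection tolerance, which is the sense in which the pair is "matched": block length one already reaches the asymptote.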

When the source-channel pair is such that the minimum attainable 
distortion is independent of the coding block length we shall say 
that the source and channel are “matched.” For the more common 
situation wherein the minimum attainable transmission distortion
decreases with increasing encoding block length to asymptotically
approach the distortion level $d_C$, we say that there is a “mismatch”
between the source and channel, and suggest as a measure of this
mismatch the “slowness” of the approach of the distortion to $d_C$.

Another interesting situation occurs when there is a choice of 
using one of several channels of different capacity. Although the 
channel of highest capacity would be the best choice when one is 
willing to use infinite block length coding, it might not be the best 
choice with finite length coding. This could easily happen if the high 
capacity channel were very much more mismatched to the source 
than some lower capacity channel.


III. SYSTEM MODEL


Figure 3 is a detailed illustration of the transmission system that we
work with. The source $\mathcal{S}$ produces a sequence of letters
$\omega = \omega_1, \omega_2, \ldots, \omega_n$, each chosen from the alphabet
$W = \{w_1, \ldots, w_H\}$, which is mapped by the encoder into a sequence of
channel input letters $\xi = \xi_1, \xi_2, \ldots, \xi_n$, each a member of
$X = \{x_1, \ldots, x_K\}$. The channel then transforms the channel input word
$\xi$ into a sequence of channel output letters
$\eta = \eta_1, \eta_2, \ldots, \eta_n$, which are members of
$Y = \{y_1, \ldots, y_L\}$, and $\eta$ in turn is decoded by the receiver into a
sequence $\zeta = \zeta_1, \zeta_2, \ldots, \zeta_n$ of letters from the decoding
space $Z = \{z_1, \ldots, z_J\}$.

The source and channel are both assumed to be constant and memory- 
less; therefore, successive events on each are independent and governed 
by the same probability distributions. In particular we have 


$$ p_n(\mathbf{w}) = \prod_{m=1}^{n} p(w^m) $$

$$ p_n(\mathbf{y} \mid \mathbf{x}) = \prod_{m=1}^{n} p(y^m \mid x^m), $$

where the superscript on $w^m$, $x^m$, $y^m$ is used to denote the $m$th letter
in the n-letter words w, x, y respectively, and is not to be confused with
the particular letters $w_m$, $x_m$, and $y_m$ in the alphabets W, X, and Y.
The subscripts on the probability distribution are hereafter dropped
whenever no confusion will occur.

Fig. 3 — Block diagram of the encoding and decoding. (Labels recovered from the figure: n source outputs, n uses of the channel, n decoding letters.)

The distortion in the system when the source word w is transmitted
but received as z is taken to be the normalized sum of the n letter
distortions, or

$$ d(\mathbf{w}, \mathbf{z}) = \frac{1}{n} \sum_{m=1}^{n} d(w^m, z^m). \qquad (1) $$

Finally, although we have set up the problem so that a sequence
of n source letters is transmitted as a sequence of n channel letters,
different block lengths at the source output and channel input can be
allowed by considering a new source and channel that are products
of the original ones, with the order of each product adjusted to obtain
the desired block length ratio $n_1/n_2$.


IV. THE SPHERE PACKING ARGUMENT 


A generalization of the sphere-packing concept is used to derive 
the lower bound. We assume the coding block length is n and derive 
a bound conditioned on the event that a particular source word w has 
occurred at the source output. We further assume that the channel 
input word x is used to transmit w, but delay the selection of x until 
the end of the derivation when the result is optimized over all possible 
choices. The total lower bound to distortion is found by averaging this 
conditioned lower bound over all source words in $W^n$. The asymptotic
form of this bound is studied in detail and from it a measure of mis- 
match between the source and channel is defined. 

The idea involved can be described with the following simple, but 
poor, bound which is subsequently improved. Remembering that the 
source word w is assumed transmitted by the channel input word x, 
we list all possible channel output words, y, ordered in decreasing
conditional probability p(y | x), and pair with each the decoder output 
word z(y) to which it is decoded by the optimum decoder. The resulting 
(conditional) distortion, 


$$ d(\mathbf{w}) = \sum_{\mathbf{y}} p(\mathbf{y} \mid \mathbf{x})\, d(\mathbf{w}, \mathbf{z}(\mathbf{y})), \qquad (2) $$


is seen to equal the sum of conditional probability-distortion products
on this list. If the set of distortion values that appear on this list is
now rearranged (with the list of conditional probabilities fixed) to
be ordered according to increasing distortion values, the resulting 
sum of conditional probability-distortion products must be smaller 
than, or at most equal to, the sum in equation 2. It therefore provides 
a lower bound. 
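
The simple bound just described is an instance of the rearrangement inequality, and a small numerical sketch may make it concrete. The probabilities and distortions below are made-up values for a three-word toy channel, and the function names are illustrative only.

```python
def conditional_distortion(probs, distortions):
    """Sum of probability-distortion products as paired by the decoder (eq. 2)."""
    return sum(p * d for p, d in zip(probs, distortions))


def rearranged_lower_bound(probs, distortions):
    """Pair the largest probabilities with the smallest distortions.

    The probabilities are listed in decreasing order and the same multiset
    of distortion values is re-listed in increasing order.
    """
    p_sorted = sorted(probs, reverse=True)
    d_sorted = sorted(distortions)
    return sum(p * d for p, d in zip(p_sorted, d_sorted))


# Hypothetical p(y | x) and d(w, z(y)) for three channel output words.
probs = [0.5, 0.3, 0.2]
distortions = [1.0, 0.0, 0.5]

actual = conditional_distortion(probs, distortions)
bound = rearranged_lower_bound(probs, distortions)
print(actual, bound)
```

The rearranged sum never exceeds the original one, whatever decoder produced the pairing, which is why it can serve as a lower bound without knowledge of the decoder beyond the list of distortion values.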

The improved lower bound uses the same sort of orderings and re- 
arrangements but includes a probability function, f(y), in the ordering 
of the channel output words. This function is defined over the set of 
channel output words, $Y^n$, and is later chosen to optimize the result.
The channel output words are now ordered according to increasing 
values of the information difference $I(\mathbf{x}, \mathbf{y}) = (1/n) \ln [f(\mathbf{y})/p(\mathbf{y} \mid \mathbf{x})]$
and each is again paired with the decoder output word z(y) to which 
it is decoded by the optimum decoder. 

The rearrangement of decoder output words is also slightly different. 
To describe this rearrangement we visualize each channel output word, 
y, as “occupying” an interval of width f(y) along the line [0, 1]. The 
decoder output word, z(y), that is paired with a particular channel 
output word y is also viewed as occupying the same region along [0, 1] 
as y, but, because any particular word $\mathbf{z}_i$ might be the decoding result of
several channel output words, the region along [0, 1] occupied by $\mathbf{z}_i$
could be a set of separated intervals. The rearrangement of decoder
output words is this time a rearrangement of occupancies in [0, 1]
toward the desired configuration wherein the decoder words are ordered
in increasing distortion along this line, and each occupies the same
total width in [0, 1] as it did before the ordering. Thus two monotone
nondecreasing functions can be defined along the line [0, 1]; one, I(h),
giving the information difference I(x, y) at the point h, $0 \le h \le 1$, and
the other, d(h), giving the distortion d(w, z) at h. The first theorem
presents a lower bound to the single word distortion in terms of these 
two functions. 


Theorem 1: The average transmission distortion, d(w), conditioned on 
the occurrence of the source word w and its transmission using the channel 
input word x, satisfies 


$$ d(\mathbf{w}) \ge \int_0^1 d(h) \exp[-nI(h)]\, dh. \qquad (3) $$


Proof: Figure 4 is used to help prove the inequality. The distortion
resulting from optimum decoding is given by equation 2, the sum of the
conditional probability-distortion products on the previous list before
rearrangement of the decoder output words. For convenience this is
rewritten here as

$$ d(\mathbf{w}) = \sum_{\mathbf{y}} d(\mathbf{w}, \mathbf{z}(\mathbf{y})) \left[ \frac{p(\mathbf{y} \mid \mathbf{x})}{f(\mathbf{y})} \right] f(\mathbf{y}), \qquad (4) $$

which can be seen equal to the “volume” in Fig. 4a enclosed by the
two “amplitude functions” $d'$ and $p/f$ and the “width measure” $f$.

Fig. 4 — The geometry for theorem 1. (Panel (a): the amplitude functions $d(\mathbf{w},\mathbf{z}) = d'(h)$ and $p(\mathbf{y} \mid \mathbf{x})/f(\mathbf{y})$; panel (b): the rearranged $d(h)$, with the decoding set sizes $g(\mathbf{z}_i)$ marked along the line.)

The rearrangement of the decoder output words to obtain the monotone
function d(h) from $d'(h)$ can be accomplished by a sequence of
interchanges of the following type. We consider any two points in
$0 \le h \le 1$, say $h_1$ and $h_2$, for which $d'(h_2) \le d'(h_1)$ and
$(p/f)(h_2) \le (p/f)(h_1)$. If we consider an interval $\Delta h$ around each point in which
both amplitude functions are single valued and interchange amplitude
values of $d'$ in the two intervals, we effect a volume transformation
that decreases (or leaves unchanged) the total volume since

$$
\begin{aligned}
\text{initial volume} - \text{final volume}
&= \left[ d'(h_1)\,\frac{p}{f}(h_1) + d'(h_2)\,\frac{p}{f}(h_2) \right] \Delta h \\
&\quad - \left[ d'(h_2)\,\frac{p}{f}(h_1) + d'(h_1)\,\frac{p}{f}(h_2) \right] \Delta h \\
&= \left[ d'(h_1) - d'(h_2) \right] \left[ \frac{p}{f}(h_1) - \frac{p}{f}(h_2) \right] \Delta h \\
&\ge 0.
\end{aligned}
$$


Volume interchanges of this type are repeated until the desired
monotonic function d(h) is obtained. The resulting volume configuration
is then as shown in Fig. 4b. As each interchange of $\Delta h$-width
volumes decreases the total volume, or leaves it unchanged, the total
volume in Fig. 4b is certainly no larger than that in Fig. 4a. We need
now only notice that $(p/f)(h) = \exp[-nI(h)]$ to recognize that the
integral in equation 3 is equal to the volume in Fig. 4b, and, therefore,
to establish the inequality claimed in the theorem.

To be sure, the construction in Fig. 4b, and the calculation of the
lower bound in equation 3, requires some knowledge of the structure
of the optimum decoder. Fortunately, this knowledge is minimal; it is
only the total width along [0, 1] occupied by each member, z, of the
decoding space $Z^n$. We refer to this occupancy as the “size” of the
decoding set for z and denote it by g(z).
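
A small numerical sketch of Theorem 1 follows. It lays the channel output words along [0, 1] with widths f(y), orders them by increasing information difference (equivalently, by decreasing p/f), reorders the distortion values increasingly while each keeps its width, and integrates the product of the two staircase functions. The toy values of p(y | x), f(y), and d(w, z(y)) are assumptions made for illustration, as are the function and variable names.

```python
import numpy as np


def theorem1_check(p, f, d):
    """Return (actual distortion of eq. 2, rearranged lower bound of eq. 3).

    p[y] = p(y | x), f[y] = width given to y on [0, 1], d[y] = d(w, z(y)).
    All three are hypothetical inputs for a toy decoder; f must sum to 1.
    """
    p, f, d = (np.asarray(a, dtype=float) for a in (p, f, d))
    actual = float(np.sum(p * d))

    # Order output words by increasing information difference,
    # i.e. by decreasing p/f, since p/f = exp(-n I).
    ratio = p / f
    by_info = np.argsort(-ratio, kind="stable")
    info_edges = np.concatenate(([0.0], np.cumsum(f[by_info])))

    # Reorder the distortion values increasingly, each keeping its width.
    by_dist = np.argsort(d, kind="stable")
    dist_edges = np.concatenate(([0.0], np.cumsum(f[by_dist])))

    # Integrate the product of the two staircases over [0, 1].
    edges = np.union1d(info_edges, dist_edges)
    mids = 0.5 * (edges[:-1] + edges[1:])
    last = len(p) - 1
    i = np.clip(np.searchsorted(info_edges, mids, side="right") - 1, 0, last)
    j = np.clip(np.searchsorted(dist_edges, mids, side="right") - 1, 0, last)
    bound = float(np.sum(np.diff(edges) * ratio[by_info][i] * d[by_dist][j]))
    return actual, bound


actual, bound = theorem1_check(p=[0.5, 0.3, 0.2],
                               f=[0.2, 0.3, 0.5],
                               d=[1.0, 0.0, 0.5])
print(actual, bound)
```

Sorting the per-word distortions with widths f(y) gives the same staircase d(h) as sorting decoder words with widths g(z), because words sharing a decoder output share a distortion value and their widths add.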

From the construction of the lower bound volume in Fig. 4b, we see
that

$$ g(\mathbf{z}) = \sum_{\mathbf{y} \in Y(\mathbf{z})} f(\mathbf{y}), $$

where Y(z) is the set of channel output words that are decoded into z
by the optimum decoder. Indeed, if we assume unique decoding by the
optimum decoder we have

$$ \sum_{\mathbf{z}} g(\mathbf{z}) = \sum_{\mathbf{z}} \sum_{\mathbf{y} \in Y(\mathbf{z})} f(\mathbf{y}) = \sum_{\mathbf{y}} f(\mathbf{y}) = 1, $$


or that g(z) is also a probability function. Even this function, though, 
is unknown in the general case or at least is impractical to calculate. 
The idea of the lower bound development, therefore, is to retain this 
unknown probability function for the present and subsequently replace 
it with another such function which minimizes the final lower bound 
expression. Within this step an approximation involving the form of 
g(z) is required which is detailed in Section 6.2. 


V. FURTHER EVALUATION OF THE LOWER BOUND IN THEOREM 1 


The integral in equation 3 can be simplified if we suppress the intermediate
variable h and relate the variables d and I directly. The pairing
of d and I through a common value of h, $d(h) \leftrightarrow I(h)$, does not by itself
define a function because several different values of d could be paired
with a given value of I, and vice versa. However, we will use the properties
that exist among these pairs to define a distortion function d(I)
which has the property that for any I, the dependent variable d is at
least as small as the smallest d(h) among the pairs that have I(h) = I.

To do this, we reinterpret the monotone nondecreasing functions
d(h) and I(h). First, we view the distortion d(w, z) as a random variable
on $Z^n$ governed by g(z). Its cumulative distribution function

$$ G(d) = \sum_{\mathbf{z}:\, d(\mathbf{w},\mathbf{z}) \le d} g(\mathbf{z}) \qquad (5) $$

is then seen to be the “inverse” of d(h). (Strictly speaking, the inverse
of a staircase function does not exist, so the term inverse is used here
only as an aid in relating d(h) and G(d) pictorially.) In a similar way
we also view the information difference I(x, y) as a random variable
on $Y^n$ governed by f(y). Its cumulative distribution function is given by

$$ F_1(I) = \sum_{\mathbf{y}:\, I(\mathbf{x},\mathbf{y}) \le I} f(\mathbf{y}), \qquad (6) $$

or the “inverse” of I(h). The desired function d(I) can now be defined
in terms of G(d) and $F_1(I)$ by relating to any information difference
value I the distortion value that satisfies

$$ F_1(I^-) = G(d). \qquad (7) $$


The following geometric interpretation of d(I) might be helpful. If
each size, or “volume,” g(z) of the decoding sets is successively placed
about the volume $g(\mathbf{z}_1)$ of the decoded word with minimum distortion
$d(\mathbf{w}, \mathbf{z}_1)$, and each size, or “volume,” f(y) of the channel output words
successively placed about the volume $f(\mathbf{y}_1)$ of the channel output word
with minimum information difference $I(\mathbf{x}, \mathbf{y}_1)$, the total volume included
by a point in the first construction at a distortion “radius”
d is G(d) and that included by a point in the second construction at an
information difference “radius” I is $F_1(I)$. The function d(I) then gives
(except for edge effects) the correspondence between the radii that
include the same volume in both geometrical constructions. Figure
5a illustrates the construction of d(I) through the chain
$I \to F_1(I^-) = G(d) \to d$.
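
The chain from I to d can be sketched for staircase distributions. The code below uses one possible reading of equation 7: for a given I it computes $F_1(I^-)$ as the total width f of the output words whose information difference is strictly below I, and returns the smallest distortion value whose cumulative size G(d) reaches that level. How ties and edges are resolved is an assumption of this sketch, not something the text fixes, and all names and numbers are illustrative.

```python
import numpy as np


def d_of_I(I_vals, f_widths, d_vals, g_sizes, I_query):
    """Distortion paired with I_query through F_1(I^-) = G(d) (assumed reading)."""
    I_vals = np.asarray(I_vals, dtype=float)
    f_widths = np.asarray(f_widths, dtype=float)
    d_vals = np.asarray(d_vals, dtype=float)
    g_sizes = np.asarray(g_sizes, dtype=float)

    # F_1(I^-): width of output words with information difference below I_query.
    F1_minus = f_widths[I_vals < I_query].sum()

    # G(d) as a staircase over the distortion values in increasing order.
    order = np.argsort(d_vals, kind="stable")
    d_sorted = d_vals[order]
    G = np.cumsum(g_sizes[order])

    # Smallest distortion value whose cumulative size reaches F_1(I^-).
    idx = np.searchsorted(G, F1_minus - 1e-12, side="left")
    return float(d_sorted[min(idx, len(d_sorted) - 1)])


# Hypothetical toy system with n = 1 and one decoder word per output word.
p = np.array([0.5, 0.3, 0.2])      # p(y | x)
f = np.array([0.2, 0.3, 0.5])      # f(y)
I_vals = np.log(f / p)             # information differences
d_vals = [1.0, 0.0, 0.5]           # d(w, z(y))
g_sizes = f                        # decoding set sizes g(z)

d_at_top = d_of_I(I_vals, f, d_vals, g_sizes, I_vals[2])
d_at_mid = d_of_I(I_vals, f, d_vals, g_sizes, I_vals[1])
print(d_at_top, d_at_mid)
```

In this toy case the value returned for each I is no larger than the smallest d(h) over the interval of [0, 1] on which I(h) = I, which is the property required of d(I).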

It is convenient at this point to introduce a second random variable
of information difference; one which is governed by p(y | x) rather than
f(y). Its cumulative distribution function is

$$ F_2(I) = \sum_{\mathbf{y}:\, I(\mathbf{x},\mathbf{y}) \le I} p(\mathbf{y} \mid \mathbf{x}). \qquad (8) $$

To distinguish the two information difference variables, we will
denote by $I_1$ the variable that has the distribution function in equation 6
and by $I_2$ the variable that has the distribution function in
equation 8.

We are now in a position to rewrite the bound in Theorem 1 in
terms of functions that involve only d and I. The distortion function
d(I) has been constructed to lower bound all d(h) with I(h) = I;
thus we can replace d(h) in equation 3 with d[I(h)]. As this substitution
replaces d(h) with a distortion function that is single valued
over subintervals of [0,1] in which I is a constant, we can perform
the integration in equation 3 by simply multiplying the integrand in
each such constant-I interval by the interval width, $dF_1(I)$, and
summing. Therefore, we can continue the inequality in equation 3
with

$$ d(\mathbf{w}) \ge \int_{I_{\min}}^{I_{\max}} d(I) \exp(-nI)\, dF_1(I), $$

which, upon using $p(\mathbf{y} \mid \mathbf{x}) = \exp(-nI) f(\mathbf{y})$, establishes the lower bound
in the next theorem.

Fig. 5 — The construction of (a) d(I) and (b) $d_L(I)$. (Labels recovered from the figure: $F_1(I)$, $G(d)$, upper bound to $G(d)$, lower bound to $F_1(I)$.)


Theorem 2: The average transmission distortion, d(w), conditioned on
the occurrence of the source word w and its transmission using the channel
input word x, satisfies

$$ d(\mathbf{w}) \ge \int_{I_{\min}}^{I_{\max}} d(I)\, dF_2(I). \qquad (9) $$


VI. AN ESTIMATE OF THE FUNCTION d(I)


6.1 The Random Variables $I_1$ and $I_2$


To obtain an estimate of d(I) we require an estimate of the two distribution
functions, G(d) and $F_1(I)$, from which d(I) was defined. We
first focus on $F_1(I)$ and the random variable $I_1$. Since the lower bounds
in Theorems 1 and 2 can be derived for any choice of f(y), we choose
a form of f(y) that simplifies the following arguments. We specify that
f(y) factors as

$$ f(\mathbf{y}) = \prod_{m=1}^{n} f(y^m). \qquad (10) $$


One consequence of this assumed form is that the information difference
I(x, y) is given as a sum of n letter information differences:

$$ I(\mathbf{x}, \mathbf{y}) = \frac{1}{n} \sum_{m=1}^{n} \ln \frac{f(y^m)}{p(y^m \mid x^m)} = \frac{1}{n} \sum_{m=1}^{n} I(x^m, y^m). \qquad (11) $$

Among these n letter information differences, however, there are
K different types, depending on the corresponding transmitted letter
$x^m$ in x. To separate these, we introduce the vector c to denote the letter
composition of the channel input word x, letting $c = c_1, c_2, \ldots, c_K$
when there are $nc_1$ appearances of the letter $x_1$ in x, $nc_2$ appearances
of $x_2$ in x, and so on. Thus we can write the information difference in
equation 11 as

$$ I(\mathbf{x}, \mathbf{y}) = \frac{1}{n} \sum_{k=1}^{K} \sum_{r=1}^{nc_k} I_{kr}, \qquad (12) $$

in which $I_{kr}$ is used to denote the information difference between the
rth appearance of the letter $x_k$ in x and the corresponding letter in y.
The interpretation of the $I_{kr}$ as letter information difference random
variables on Y governed by the letter probability function f(y) can
now be seen to be consistent with the previous interpretation of $I_1$
as a word information difference random variable on $Y^n$ governed by
f(y). Using the abbreviations


$$ f(y_l) = f_l, \qquad p(y_l \mid x_k) = p_{kl}, $$

the probability distribution function of $I_{kr}$ can be written as

$$ \Pr\left[ I_{kr} = \ln \frac{f_l}{p_{kl}} \right] = f_l; \qquad 1 \le r \le nc_k; \quad 1 \le k \le K. \qquad (13) $$

What this has accomplished is to cast $I_1$ as the sum of n independent
random variables, a step that enables us to use large number laws to
estimate $F_1(I)$.¹⁰⁻¹²

In an almost identical way, the random variable $I_2$ can be cast as a
sum of n independent random variables. This can be done if we associate
with the variable $I_{kr}$ the probability distribution function

$$ \Pr\left[ I_{kr} = \ln \frac{f_l}{p_{kl}} \right] = p_{kl}; \qquad 1 \le r \le nc_k; \quad 1 \le k \le K \qquad (14) $$

instead of that in equation 13. With this distribution the word information
difference variable I(x, y) in equation 12 can be seen to be governed
by the probability function p(y | x); therefore, it is equal to the random
variable $I_2$.
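
Because both word variables are normalized sums of independent letter variables, their means and variances follow directly from equations 12 through 14. The sketch below computes them for a hypothetical two-input, two-output channel; the channel matrix, the letter function f, the composition c, and all names are assumptions made for illustration, and every $p_{kl}$ is taken to be positive so that the logarithms are finite.

```python
import numpy as np


def word_mean_and_variance(c, f, P, n, governed_by="f"):
    """Mean and variance of the word information difference of eq. 12.

    c: composition vector (length K), f: letter function f_l (length L),
    P: channel matrix p_kl (K x L), n: block length.
    governed_by="f" uses eq. 13 (variable I_1); "p" uses eq. 14 (variable I_2).
    """
    c = np.asarray(c, dtype=float)
    f = np.asarray(f, dtype=float)
    P = np.asarray(P, dtype=float)
    I = np.log(f[None, :] / P)                      # I_kl = ln(f_l / p_kl)
    weights = np.broadcast_to(f, P.shape) if governed_by == "f" else P
    letter_mean = (weights * I).sum(axis=1)         # mean of I_kr for each k
    letter_var = (weights * I ** 2).sum(axis=1) - letter_mean ** 2
    # (1/n) * sum over k of n*c_k identical terms gives the mean;
    # (1/n^2) * sum over k of n*c_k variances gives the variance.
    return float(c @ letter_mean), float(c @ letter_var) / n


P = [[0.9, 0.1], [0.2, 0.8]]   # hypothetical channel
f = [0.5, 0.5]                 # hypothetical letter function
c = [0.5, 0.5]                 # hypothetical composition

mean_I1, var_I1 = word_mean_and_variance(c, f, P, n=100, governed_by="f")
mean_I2, var_I2 = word_mean_and_variance(c, f, P, n=100, governed_by="p")
_, var_I2_200 = word_mean_and_variance(c, f, P, n=200, governed_by="p")
print(mean_I1, mean_I2, var_I2, var_I2_200)
```

The mean of $I_1$ is nonnegative and the mean of $I_2$ is nonpositive (each is a signed average of relative entropies), and doubling n halves the variance.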


6.2 The Random Variable d 


In the work so far, the function g(z) is that probability function
induced on $Z^n$ by f(y) through the optimum decoder function and cannot,
therefore, be freely chosen once f(y) is chosen. On the other hand its
precise calculation from the optimum decoder is impractical. The only
alternative is to retain the unknown function g(z) in the lower bound
expressions and to minimize the final lower bound to distortion over
all possible probability functions on $Z^n$. Since g(z) is one such probability
function the inequality in the lower bound is continued. Unfortunately,
when this is done it cannot, in general, be shown that the function which
minimizes the lower bound factors into n letter probabilities, a form
which we were permitted to assume for f(y). However, to proceed
beyond the bounds in Theorems 1 and 2, it is necessary to approximate
this g(z) by such a product, as in

$$ g(\mathbf{z}) = \prod_{m=1}^{n} g(z^m). \qquad (15) $$

The necessity for an approximation of this type is, of course, because of
the requirement that an estimate be made for the distribution function
G(d). The assumed form for g(z) in equation 15 will again allow us to
use large number laws to obtain this estimate.

More specifically, the assumed product form for g(z) allows us to
cast the word distortion random variable d(w, z) as a sum of n independent
letter variables. This is done in the following way. Among the
letter distortions $d(w^m, z^m)$ that sum to the total word distortion there
are H different types, corresponding to each of the different letters
$w_i$, $1 \le i \le H$, that appear in the source word w.

If the composition of this word is $q = q_1, q_2, \ldots, q_H$, that is, if
there are $nq_1$ appearances of $w_1$ in w, $nq_2$ appearances of $w_2$, and so on,
the normalized word distortion can be written as

$$ d(\mathbf{w}, \mathbf{z}) = \frac{1}{n} \sum_{i=1}^{H} \sum_{r=1}^{nq_i} D_{ir}. \qquad (16) $$

In this expression $D_{ir}$ is used to denote the distortion between the
rth appearance of the letter $w_i$ in w and the corresponding letter in
z. Equation 15 now allows the interpretation of the $D_{ir}$ as independent
random variables, having the probability distributions

$$ \Pr[D_{ir} = d_{ij}] = g_j; \qquad 1 \le r \le nq_i, \quad 1 \le i \le H, \qquad (17) $$

$$ d(w_i, z_j) = d_{ij}, \qquad g(z_j) = g_j, $$

with the result that G(d) is an n-fold convolution of elementary distribution
functions for which there exist many estimating forms.¹⁰⁻¹²

We realize that the approximation in equation 15 is not entirely 
satisfactory because it eliminates nonproduct probability functions from 
the minimization of the lower bound and, as far as we know, one of 
these functions could provide the minimization. However, there is 
good reason to believe that this approximation does not significantly 
affect the bound when n is reasonably large. For example, in the next 
several sections we derive a lower bound to distortion that uses the
product form in equation 15. For this bound the required minimization
over all probability functions g(z) is reduced to one over all J-dimensional
vectors g. It can be shown that if in the limit as n becomes large,
the product form requirement for g(z) is relaxed, and the minimization
of this lower bound is again made over all probability functions g(z),
then the optimizing function still has the product form.

Even more significant is the asymptotic form of the lower bound that
is derived using equation 15. We later show that it is only the final
value of the minimizing decoder set size vector, that at $n = \infty$, that affects
both the asymptote of the lower bound, $d_C$, and the next lowest order
term, which is one proportional to 1/n. Values of the minimizing vector
for finite n affect only terms of o(1/n).

Further, it can be shown that a similar conclusion is reached even
if the independence property assumed over letters in equation 15 is
generalized to be over blocks of length r, that is if

$$ g(\mathbf{z}) = \prod_{m=1}^{n/r} g(\mathbf{z}^{(m)}), $$

where $\mathbf{z}^{(m)}$ denotes the mth block of r successive letters of z.

When g(z) is assumed to have this form, the minimization of the lower
bound over all decoder set sizes is a minimization over all probability
functions on $Z^r$. The conclusion that can be made from the bound
derived using this assumption is that it is again only the value of the
minimizing decoder set size function at $n = \infty$ that influences
both the asymptote and the term proportional to 1/n. And,
at $n = \infty$, the minimizing decoder set size function on $Z^r$
factors into a product of single letter probability functions on Z. When
this solution is substituted in the bound (that uses r = 1) the asymptotic
form is the same for every choice of the constant r. Only lower order
terms differ for different values of r.

There is one situation in which the assumed product form in equation 
15 does not represent an approximation. That is the case of a doubly 
uniform source, which is a source that has a uniform probability dis- 
tribution over its letters and has a distortion matrix in which each row 
and column is the respective permutation of another row and column. 
For such a source it has been shown’ that the probability distribution 
g(z) which minimizes the lower bound in Theorem 1 is uniform for all
n, thus has the factorability property in equation 15.
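
The doubly uniform condition is easy to test mechanically. The sketch below reads the definition as: the letter probabilities are all equal, every row of the distortion matrix is a permutation of every other row, and every column is a permutation of every other column. This reading, the function name, and the example matrices are assumptions made for illustration.

```python
import numpy as np


def is_doubly_uniform(source_probs, distortion, tol=1e-12):
    """Check the doubly uniform property for a source and distortion matrix."""
    p = np.asarray(source_probs, dtype=float)
    D = np.asarray(distortion, dtype=float)
    uniform = np.allclose(p, 1.0 / len(p), atol=tol)
    # Two rows are permutations of each other exactly when their sorted
    # entries coincide; the same holds for columns.
    rows = np.sort(D, axis=1)
    cols = np.sort(D, axis=0)
    rows_ok = np.allclose(rows, rows[0], atol=tol)
    cols_ok = np.allclose(cols, cols[:, [0]], atol=tol)
    return bool(uniform and rows_ok and cols_ok)


# Three equally likely letters with d(i, j) = 1 - delta_ij.
hamming = 1.0 - np.eye(3)
print(is_doubly_uniform([1 / 3, 1 / 3, 1 / 3], hamming))

# A matrix whose rows are not permutations of one another.
print(is_doubly_uniform([0.5, 0.5], [[0.0, 1.0], [2.0, 0.0]]))
```

The binary symmetric source of Section II passes this test, which is consistent with its role as an example where the product form is exact.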


6.3 A Lower Bound to d(I) 


We now seek an approximation to d(I) that we can substitute in
equation 9 and preserve the inequality. A safe approximation to d(I)
can be had if, instead of equating $F_1(I^-)$ to G(d) as in equation 7, we
equate an upper bound estimate of G(d) to a lower bound estimate of
$F_1(I^-)$; because G(d) is nondecreasing in d, the distortion obtained in
this way cannot exceed d(I). Figure 5b illustrates this construction. The
result is another distortion function, $d_L(I)$, that satisfies

$$ d_L(I) \le d(I), \qquad (18) $$

which can be used in equation 9 to obtain

$$ d(\mathbf{w}) \ge \int_{I_{\min}}^{I_{\max}} d_L(I)\, dF_2(I). \qquad (19) $$


Since the random variable $I_2$ is a normalized sum of n independent
random variables, its variance is proportional to 1/n. Consequently,
when n becomes large the distribution function $F_2(I)$ has almost all
of its “rise” around the mean of $I_2$, which we denote by $\bar{I}$. In this
region, $I \approx \bar{I}$, $d \approx d(\bar{I})$, the values of both distribution functions G(d)
and $F_1(I)$ are exponentially small. Therefore, the bounds to the tails of
distribution functions¹⁰⁻¹² are applicable to the estimation of G(d) and
$F_1(I)$ in this region. Indeed, it was with the intended use of these
powerful bounds that we formed both the distortion and information
difference random variables as sums of n independent letter random
variables. All of the bounds, though, are parametric in form and allow
only a parametric representation of $d_L(I)$.

We have elsewhere® applied strict upper and lower bounds to $G(d)$
and $F_2(I)$, respectively, to obtain the function $d_L(I)$. However, when
these bounds are used, the resulting total lower bound to transmission
distortion, though applicable for all block lengths n, does not
reveal the correct asymptotic behavior inherent to the sphere-packing
procedure which has been used. (This happens because the strict
bounds to $G(d)$ and $F_2(I)$ themselves do not have the correct asymptotic
form for large n.)

In addition, the resulting lower bound to the total distortion is
very complex and so does not provide much insight into the factors
which affect the rate of approach of the performance curve to its
asymptote. For these reasons, we instead use Shannon’s“ and Gallager’s!?
asymptotic forms for the tails of distribution functions to
bound $G(d)$ and $F_2(I)$. These are:


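$$G(d) \le \left[\frac{1}{|s|\sqrt{2\pi n\,\mu''(s)}} + A_1(n, s)\right] \exp n[\mu(s) - s\mu'(s)] \tag{20a}$$

$$\mu'(s) = d \tag{20b}$$

with

$$0 < d \le E(d \mid q) = \sum_{z} d(w, z \mid \mathrm{comp}\ w = q)\, g(z),$$

and

$$F_2(I) \ge \left[\frac{1}{|t|\sqrt{2\pi n\,\gamma''(t)}} + A_2(n, t)\right] \exp n[\gamma(t) - t\gamma'(t)] \tag{21a}$$

$$\gamma'(t) = I \tag{21b}$$

with

$$I_{\min} < I \le E(I \mid c) = \sum_{y} I(x, y \mid \mathrm{comp}\ x = c)\, f(y).$$

In these bounds, $A_1(n, s)$ and $A_2(n, t)$ are sums of rather difficult
integrals but each has been shown by Shannon and Gallager to be

$$o\!\left(\frac{1}{\sqrt{n}}\right).$$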

Also within the previous bounds, we have used $\mu(s)$ to denote the
semi-invariant moment generating function of the variable d,

$$\mu(s) = \sum_{i=1}^{H} q_i \mu_i(s) = \sum_{i=1}^{H} q_i \ln \sum_{j=1}^{J} g_j \exp s d_{ij}, \tag{22}$$

and $\gamma(t)$ to denote the semi-invariant moment generating function
of the variable I,

$$\gamma(t) = \sum_{k=1}^{K} c_k \gamma_k(t) = \sum_{k=1}^{K} c_k \ln \sum_{l=1}^{L} f_l^{\,1+t}\, p_{kl}^{\,-t}. \tag{23}$$

To guarantee the boundedness of $\gamma(t)$, we restrict the vector f to
have nonzero components. This does not affect the resulting bound.
(Actually, these bounds strictly apply only when the variables d and
I are nonlattice. For lattice variables the corresponding bounds’’'”
have in their coefficient a quantity A which does not change continuously
with the argument of the distribution function, and cannot be
used within our derivation. One alternative would be to decrease one
assigned letter distortion d(w, z) by an arbitrarily small irrational
number, and similarly, to change two transition probabilities on the
channel in a way consistent with a lower bound to distortion. The new
variables d' and I' would then be nonlattice.)
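The role of $\mu(s)$ can be made concrete with a small numerical sketch. The helper name `mu_terms` and the choice of a binary symmetric source with Hamming distortion and a uniform vector g are assumptions made for illustration only. For that source, the parametric pair $d = \mu'(s)$, $R = -[\mu(s) - s\mu'(s)]$ should trace the familiar curve $R = \ln 2 - H(d)$ (in nats).

```python
import math

def mu_terms(s, q, g, d):
    """Return (mu, mu', mu'') at s for composition q, decoder letter
    probabilities g, and distortion matrix d[i][j], per equation 22."""
    mu = mu1 = mu2 = 0.0
    for qi, row in zip(q, d):
        weights = [gj * math.exp(s * dij) for gj, dij in zip(g, row)]
        total = sum(weights)
        mean = sum(w * dij for w, dij in zip(weights, row)) / total
        second = sum(w * dij * dij for w, dij in zip(weights, row)) / total
        mu += qi * math.log(total)          # q_i * mu_i(s)
        mu1 += qi * mean                    # q_i * mu_i'(s): tilted mean
        mu2 += qi * (second - mean * mean)  # q_i * mu_i''(s): tilted variance
    return mu, mu1, mu2

def binary_entropy(x):
    """Entropy of (x, 1 - x) in nats."""
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

# Illustrative (assumed) source: binary symmetric, Hamming distortion.
q = g = [0.5, 0.5]
hamming = [[0, 1], [1, 0]]
s = -2.0                                    # any s < 0
mu, d, var = mu_terms(s, q, g, hamming)
rate = -(mu - s * d)                        # parametric rate in nats
```

Sweeping s from 0 toward minus infinity moves the point (d, rate) from (1/2, 0) toward (0, ln 2).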
The desired distortion function, $d_L(I)$, can now be defined by
equating the two bounds in equations 20 and 21. It can be constructed
through the chain $I^- \to t \to s \to d$, in which the superscript
can now be dropped since the bound to $F_2(I)$ is continuous in I.
It is important to notice that the region of validity of the previous
two bounds allows definition of the function $d_L(I)$ only in a subinterval
$[I_a, I_b]$ of $[I_{\min}, I_{\max}]$ with

$$I_{\min} \le I_a \le \bar{I} \le I_b \le E(I \mid c).$$

Outside the interval $[I_a, I_b]$ we can define $d_L(I)$ equal to zero and
write the lower bound in equation 19 as

$$d(w) \ge \int_{I_a}^{I_b} d_L(I)\, dF_2(I). \tag{24}$$

We are now faced with the difficult integration of a doubly parametric
expression. Rather than integrate directly, we use the following
Taylor series expansion for $d_L(I)$ within $[I_a, I_b]$:

$$d_L(I) = d_L(\bar{I}) + d_L'(\bar{I})(I - \bar{I}) + \tfrac{1}{2} d_L''(\bar{I})(I - \bar{I})^2 + \tfrac{1}{6} d_L'''(I')(I - \bar{I})^3 \equiv TS(d_L)$$

with $I_a < I' < I_b$. (The indicated derivatives can be shown to exist
within the restricted interval $[I_a, I_b]$.) Using this form for $d_L(I)$
within equation 24 we see that if the region of integration were
$[I_{\min}, I_{\max}]$ instead of $[I_a, I_b]$, the resulting form would be a sum of central
moments of $I_2$ with the Taylor series derivatives as coefficients. To
restore this form we rewrite equation 24 as

$$d(w) \ge \int_{I_{\min}}^{I_{\max}} TS(d_L)\, dF_2 - \int_{I_{\min}}^{I_a} TS(d_L)\, dF_2 - \int_{I_b}^{I_{\max}} TS(d_L)\, dF_2. \tag{25}$$

In these integrals, the lower limit $I_{\min}$ is finite since $f_l$ is assumed
nonzero for all l, and $I_{\max}$ can be taken as the largest finite value of
$\ln f_l/p_{kl}$ since this is the largest value of I for which the random
variable $I_2$ has nonzero probability. Therefore the function $TS(d_L)$ is
bounded in $[I_{\min}, I_a]$ and $[I_b, I_{\max}]$ with the result that the last two
integrals in equation 25 are exponentially small in n. The first integral
in this equation has the desired form, involving the central
moments of $I_2$:

$$\int_{I_{\min}}^{I_{\max}} TS(d_L)\, dF_2 = d_L(\bar{I}) + d_L'(\bar{I})\,E(I - \bar{I}) + \tfrac{1}{2} d_L''(\bar{I})\,E(I - \bar{I})^2 + \tfrac{1}{6} d_L'''(I')\,E\{(I - \bar{I})^3\}.$$

In the above equation the second term is zero since we have specified
that $\bar{I}$ is the expected value of $I_2$, and the last term can be shown to
be proportional to $(1/n)^2$. This establishes the result in the next theorem.


Theorem 3: The conditional average transmission distortion, d(w), satisfies

$$d(w) \ge d_L(\bar{I}) + \tfrac{1}{2} d_L''(\bar{I})\, \mathrm{Var}(I_2) + o\!\left(\frac{1}{n}\right). \tag{26}$$

In contrast to the last, low order term, the variance of $I_2$ is proportional
to 1/n.

The simplicity in the form of the last result is due to the use of 
the Taylor series expansion which not only has allowed us to evaluate 
a difficult integral, but has provided a natural way of separating the 
important terms in the lower bound to distortion. 
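The way a Taylor expansion about the mean isolates a 1/n term can be illustrated numerically. The sketch below is not the paper's computation: it replaces $d_L$ by an arbitrary smooth function (here the exponential) and $I_2$ by a normalized binomial sum, both assumptions chosen because the expectation can then be evaluated exactly. The two-term estimate, value at the mean plus half the second derivative times the variance, should then differ from the exact expectation by an amount that shrinks roughly as $1/n^2$.

```python
import math

def binomial_mean_of(func, n, p):
    """Exact E[func(K/n)] for K ~ Binomial(n, p), using log-space weights."""
    total = 0.0
    for k in range(n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1)
                   - math.lgamma(n - k + 1)
                   + k * math.log(p) + (n - k) * math.log(1 - p))
        total += math.exp(log_pmf) * func(k / n)
    return total

def two_term_estimate(func, second_derivative, n, p):
    """func(mean) + (1/2) func''(mean) * Var, with Var = p(1 - p)/n."""
    return func(p) + 0.5 * second_derivative(p) * p * (1 - p) / n

p = 0.3
errors = {}
for n in (50, 500):
    exact = binomial_mean_of(math.exp, n, p)
    approx = two_term_estimate(math.exp, math.exp, n, p)
    errors[n] = abs(exact - approx)   # remainder after the 1/n term
```

A tenfold increase in n reduces the remainder by about a factor of one hundred, which is the sense in which the neglected terms are o(1/n).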


6.4 The Evaluation of $d_L(\bar{I})$ and $d_L''(\bar{I})$


We shall denote by $s_o$ and $t_o$ the parameter values consistent with
$I = \bar{I}$ in equations 20 and 21. Since

$$\gamma'(-1) = \sum_{k=1}^{K} \sum_{l=1}^{L} c_k\, p_{kl} \ln f_l/p_{kl},$$

which is seen equal to $E(I_2) = \bar{I}$, we can conclude that $t_o = -1$. We
also note here for future use that

$$\gamma(-1) = 0.$$

The first of the two significant terms in equation 26 is immediate:

$$d_L(\bar{I}) = \mu'(s_o).$$

Next, elementary differentiation of the parametric expressions in
equations 20 and 21 provides

$$d_L'(\bar{I}) = \left.\frac{t}{s}\right|_{t_o, s_o} = -\frac{1}{s_o}$$

and

$$d_L''(\bar{I}) = \frac{1}{s_o\,\gamma''(-1)} - \frac{1}{s_o^3\,\mu''(s_o)}.$$

Finally, the variance of $I_2$ is seen from equation 12 to equal

$$\mathrm{Var}(I_2) = \frac{1}{n} \sum_{k=1}^{K} c_k\, \mathrm{Var}(I_k) = \frac{1}{n} \sum_{k=1}^{K} c_k \left[ \sum_{l=1}^{L} p_{kl} (\ln f_l/p_{kl})^2 - \left( \sum_{l=1}^{L} p_{kl} \ln f_l/p_{kl} \right)^2 \right] = \frac{1}{n}\,\gamma''(-1).$$

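With the substitution of these terms in equation 26 we obtain the
result in the next theorem.


Theorem 4: The conditional average transmission distortion, d(w), satisfies

$$d(w) \ge \mu'(s_o) - \frac{1}{2 n s_o} \left[ \frac{\gamma''(-1)}{s_o^2\,\mu''(s_o)} - 1 \right] + o\!\left(\frac{1}{n}\right) \tag{27}$$

in which $s_o$ is given by

$$\mu(s_o) - s_o\mu'(s_o) = \bar{I} - \frac{1}{2n} \ln \frac{\gamma''(-1)}{s_o^2\,\mu''(s_o)} + o\!\left(\frac{1}{n}\right). \tag{28}$$


It remains to average this lower bound over the entire source space
$W^n$.


VII. THE AVERAGE OVER THE SOURCE SPACE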


To average the lower bound in Theorem 4 over the source space $W^n$
we assume that channel input words of equal composition are used for
all transmissions. It has been shown*® that this assumption does not
affect the asymptotic form of the lower bound to distortion. We first
notice that the lower bound in Theorem 4 depends upon the source
word w only through its composition q which enters in the form of
$\mu(s)$. Therefore, we can average d(w) over the set of all compositions
for w rather than over all of $W^n$. As all composition vectors for w are
probability vectors, they are all located on an H − 1 dimensional
hyperplane, termed the composition space $Q^H$, which is in the "first
quadrant" of $R^H$ and intersects each axis $q_i$ at one. Not all points in
$Q^H$ are possible word compositions for any particular n. For example,
with H = 2 and n = 2 there are only three possible compositions. But
as n increases, the points in $Q^H$ that are source word compositions
become quite dense.

The probability that any particular composition q occurs at the
source output is

$$P(q) = N(q) \prod_{i=1}^{H} p_i^{\,n q_i} \tag{29}$$

in which N(q) is the number of distinct source sequences with the
composition q and the product is the probability of each. The number
N(q) is given by

$$N(q) = \frac{n!}{\prod_{i=1}^{H} (n q_i)!}.$$

We now write the total average source distortion, $d(\mathcal{S})$, as

$$d(\mathcal{S}) = \sum_{\text{all source compositions}} d(q)\, P(q)$$

which we can lower bound by substituting for d(q) the lower bound
found in Theorem 4. Rather than write out the entire expression each
time we want to use it, we let $d_L(q)$ denote the right side of equation 27,
and thus have

$$d(\mathcal{S}) \ge \sum_{\text{all source compositions}} d_L(q)\, P(q). \tag{30}$$

Viewed as a function over $Q^H$, P(q) is a set of impulses. This allows
us to consider the distortion function $d_L(q)$ a continuous function over
all $Q^H$, rather than a function defined only at composition points, and
to write

$$d(\mathcal{S}) \ge \int \cdots \int_{Q^H} d_L(q)\, P(q)\, dq. \tag{31}$$

Again because the expression for $d_L(q)$ in equations 27 and 28 is parametric,
we use a Taylor series expansion of this distortion function to
evaluate the integral. The point chosen for the expansion is p, the
probability vector characterizing the source. The reason for this choice
is that the components of this vector are the means of the coordinates
of q when the latter are considered (dependent) random variables
governed by P(q). The Taylor series then contains terms of the type
$(q_i - p_i)$, $(q_i - p_i)(q_j - p_j)$, and so on, which, when averaged by
P(q), are the central moments of the components of q.

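Using the notation $d'_{L,i}(p)$ to indicate the partial derivative of $d_L(q)$
with respect to $q_i$ evaluated at q = p (and similarly for higher
order derivatives), we have

$$d(\mathcal{S}) \ge \int \cdots \int_{Q^H} \Big[ d_L(p) + \sum_{i=1}^{H} d'_{L,i}(p)(q_i - p_i) + \tfrac{1}{2} \sum_{i}\sum_{j} d''_{L,ij}(p)(q_i - p_i)(q_j - p_j) + \tfrac{1}{6} \sum_{i}\sum_{j}\sum_{k} d'''_{L,ijk}(\zeta)(q_i - p_i)(q_j - p_j)(q_k - p_k) \Big] P(q)\, dq \tag{32}$$

with $\zeta \in Q^H$. The central moments of the components of q can be found
to be

$$E(q_i - p_i) = 0,$$

$$E[(q_i - p_i)(q_j - p_j)] = \frac{1}{n}\,(p_i \delta_{ij} - p_i p_j), \tag{33}$$

$$E[(q_i - p_i)(q_j - p_j)(q_k - p_k)] = \left(\frac{1}{n}\right)^{2} [\, p_i \delta_{ij}\delta_{jk} - p_i p_j \delta_{ki} - p_i p_k \delta_{ij} - p_j p_i \delta_{jk} + 2 p_i p_j p_k \,],$$

which, when substituted in equation 32, yields

$$d(\mathcal{S}) \ge d_L(p) + \frac{1}{2n} \Big[ \sum_{i} d''_{L,ii}(p)\, p_i - \sum_{i}\sum_{j} d''_{L,ij}(p)\, p_i p_j \Big] + o\!\left(\frac{1}{n}\right). \tag{34}$$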
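The composition probabilities of equation 29 and the second central moments in equation 33 can be checked by direct enumeration for small n. The function names and the example source probabilities below are assumptions made for this sketch; the enumeration is exponential in H and is meant only as a sanity check, not as a practical method.

```python
import math
from itertools import product

def compositions(n, H):
    """All count vectors (n q_1, ..., n q_H) with non-negative entries summing to n."""
    return [c for c in product(range(n + 1), repeat=H) if sum(c) == n]

def composition_probability(counts, p):
    """P(q) = N(q) * prod_i p_i^(n q_i), with N(q) = n! / prod_i (n q_i)!."""
    n = sum(counts)
    ways = math.factorial(n)
    for c in counts:
        ways //= math.factorial(c)      # exact integer division at every step
    prob = float(ways)
    for c, pi in zip(counts, p):
        prob *= pi ** c
    return prob

def second_central_moment(n, p, i, j):
    """E[(q_i - p_i)(q_j - p_j)] by exact enumeration over compositions."""
    return sum(composition_probability(c, p)
               * (c[i] / n - p[i]) * (c[j] / n - p[j])
               for c in compositions(n, len(p)))

p = [0.5, 0.3, 0.2]     # illustrative source letter probabilities
n = 6
total = sum(composition_probability(c, p) for c in compositions(n, len(p)))
var_00 = second_central_moment(n, p, 0, 0)   # expect p_0 (1 - p_0) / n
cov_01 = second_central_moment(n, p, 0, 1)   # expect -p_0 p_1 / n
```

With H = 2 and n = 2 the enumeration returns the three compositions mentioned in the text.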


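Referring to equation 27 we see that the required second derivative
need only be taken of $\mu'(s_o)$ as the two 1/n coefficients allow other
terms to be absorbed in those of o(1/n). The differentiation is lengthy,
but straightforward, and yields

$$\frac{\partial}{\partial q_i}\, \mu'(s_o, q) = \frac{\mu_i(s_o)}{s_o}$$

and

$$\left.\frac{\partial^2}{\partial q_i\, \partial q_j}\, \mu'(s_o, q)\right|_{q = p} = -\frac{\theta_i \theta_j}{s_o^3\, \mu''(s_o, p)}$$

where

$$\theta_i = \mu_i(s_o) - s_o \mu_i'(s_o).$$

Upon substitution of these derivatives in equation 34 we obtain

$$d(\mathcal{S}) \ge d_L(p) - \frac{1}{2 n s_o^3\, \mu''(s_o)} \Big[ \sum_{i} p_i \theta_i^2 - \sum_{i}\sum_{j} p_i p_j \theta_i \theta_j \Big] + o\!\left(\frac{1}{n}\right) = d_L(p) - \frac{\mathrm{Var}(\theta)}{2 n s_o^3\, \mu''(s_o)} + o\!\left(\frac{1}{n}\right).$$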
With the final substitution of the expression for $d_L(p)$ in equation
27 we have the result in the next theorem.


Theorem 5: The average transmission distortion of the source $\mathcal{S}$, when
used with the channel $\mathcal{C}$, is lower bounded by

$$d(\mathcal{S}) \ge \mu'(s_o, p) - \frac{1}{2 n s_o} \left[ \frac{\gamma''(-1) + \mathrm{Var}(\theta)}{s_o^2\, \mu''(s_o, p)} - 1 \right] + o\!\left(\frac{1}{n}\right) \tag{35}$$

in which $s_o$ is given by

$$\mu(s_o, p) - s_o \mu'(s_o, p) = \bar{I} - \frac{1}{2n} \ln \frac{\gamma''(-1)}{s_o^2\, \mu''(s_o, p)} + o\!\left(\frac{1}{n}\right). \tag{36}$$

In this bound the vector g is, for the reasons previously stated, that
which minimizes the bound, the vector f is chosen to maximize the
bound in order to obtain the tightest bound, and the vector c is chosen
to minimize the bound, that is, to use the best composition for the
channel input code words. As formidable as the derivations of these
extrema appear, we show in the next section that the work involved in
establishing the asymptotic behavior of the bound is actually quite
simple.

It should be mentioned that these results do not apply when
$\gamma''(-1) = 0$, which is a situation that occurs when channel $\mathcal{C}$ is noiseless,
for the reason that we have divided by and canceled factors equal
to $\gamma''(-1)$. The result for this case is derived separately in Section IX.


VIII. THE ASYMPTOTE AND RATE OF APPROACH 


8.1 The Asymptote 


When n becomes large, the limiting form of the bound in Theorem
5 is:

$$d_{\lim}(\mathcal{S}) \ge \mu'(s_o, p)$$

in which $s_o$ satisfies

$$\mu(s_o, p) - s_o \mu'(s_o, p) = \bar{I}$$

with

$$\bar{I} = \sum_{k=1}^{K} c_k \sum_{l=1}^{L} p_{kl} \ln f_l/p_{kl}.$$

The vectors g, f, and c must now be chosen to provide the extremum
indicated just after Theorem 5. Since only f and c enter in the expression
for $\bar{I}$, we can minimize $d_{\lim}(\mathcal{S})$ with respect to g for a constant $\bar{I}$. This
minimization provides precisely the expression’ for the rate-distortion
curve for $\mathcal{S}$ at the information rate $\bar{I}$. It is further shown in the same
reference that the value of g which provides the minimization is the
vector that describes the output statistics on the test channel for $\mathcal{S}$
at the point $(d_{\bar{I}}, \bar{I})$ on the rate-distortion curve.

The maximization and minimization of $d_{\lim}(\mathcal{S})$ with f and c, respectively,
can be accomplished by finding the same extremum of $\bar{I}$. The
resulting values for f and c are the output and input probabilities,
respectively, on channel $\mathcal{C}$ when it is being used to capacity and the
value of $\bar{I}$ at the extremum point is −C. Therefore, the resulting
expression for the asymptote of the lower bound is

$$d(\mathcal{S}) \ge \min_{g}\, \mu'(s_o, p) = d_C \tag{37}$$

with $s_o$ satisfying

$$\mu(s_o, p) - s_o \mu'(s_o, p) = -C. \tag{38}$$

This agrees with what we know to be the correct asymptote of the
performance curve.”
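The asymptote can be evaluated numerically by solving the parametric condition $\mu(s) - s\mu'(s) = -C$ for s and then reading off $d_C = \mu'(s)$. The following is a minimal sketch under stated assumptions: a binary symmetric source with Hamming distortion, a uniform vector g, and a binary symmetric channel whose capacity is computed in nats. The function names are ours. For this pair the asymptote should equal the channel crossover probability.

```python
import math

def binary_entropy(x):
    """Entropy of (x, 1 - x) in nats."""
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

def rate_at(s):
    """-(mu(s) - s mu'(s)) for a binary symmetric source with Hamming
    distortion and uniform g (assumed example)."""
    mu = math.log(0.5 * (1 + math.exp(s)))
    mu1 = math.exp(s) / (1 + math.exp(s))
    return -(mu - s * mu1)

def solve_s(capacity, lo=-40.0, hi=-1e-9, iterations=200):
    """Bisection for s < 0 such that mu(s) - s mu'(s) = -capacity."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if rate_at(mid) > capacity:
            lo = mid      # rate too high: move s toward zero
        else:
            hi = mid
    return 0.5 * (lo + hi)

crossover = 0.1
capacity = math.log(2) - binary_entropy(crossover)   # BSC capacity in nats
s_o = solve_s(capacity)
d_C = math.exp(s_o) / (1 + math.exp(s_o))            # mu'(s_o)
```

Bisection is used only because the rate is monotone in s on the negative axis; any root finder would serve.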


8.2 The Rate of Approach to the Asymptote 


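Since the lower bound in equations 35 and 36 is parametric in s and
includes the vectors f, c, and g, which when optimally chosen are functions
of n, the complete asymptotic dependence of this lower bound upon
the block length n is not obvious. To establish this dependence, we
first find the full derivative of the lower bound in Theorem 5 with respect
to n and then integrate the result between n and infinity.

We first simplify the procedure slightly by using our freedom to
choose f by setting this vector equal to its value at n = ∞, f(∞). This
does not change the end result. We also drop the terms of o(1/n) in
equations 35 and 36, because they clearly do not affect the asymptotic
result. Denoting the right side of equation 35 by $d_L$ and using the chain
rule several times, we can write the desired derivative as

$$\frac{d d_L}{dn} = \left(\frac{\partial d_L}{\partial n}\right)_{c,g,s} + \left(\frac{\partial d_L}{\partial s}\right)_{c,g,n} \frac{ds}{dn} + \sum_{j} \left(\frac{\partial d_L}{\partial g_j}\right)_{c,n,s} \frac{dg_j}{dn} + \sum_{k} \left(\frac{\partial d_L}{\partial c_k}\right)_{g,n,s} \frac{dc_k}{dn}$$

with

$$\frac{ds}{dn} = \left(\frac{\partial s}{\partial n}\right)_{g,c} + \sum_{j} \left(\frac{\partial s}{\partial g_j}\right)_{c,n} \frac{dg_j}{dn} + \sum_{k} \left(\frac{\partial s}{\partial c_k}\right)_{g,n} \frac{dc_k}{dn}.$$

The notations outside each parenthesis indicate the variables which
are momentarily held constant. Substitution yields:

$$\frac{d d_L}{dn} = \left(\frac{\partial d_L}{\partial n}\right)_{c,g,s} + \left(\frac{\partial d_L}{\partial s}\right)_{c,g,n} \left(\frac{\partial s}{\partial n}\right)_{g,c} + \sum_{j} \left[ \left(\frac{\partial d_L}{\partial s}\right)_{c,g,n} \left(\frac{\partial s}{\partial g_j}\right)_{c,n} + \left(\frac{\partial d_L}{\partial g_j}\right)_{c,n,s} \right] \frac{dg_j}{dn} + \sum_{k} \left[ \left(\frac{\partial d_L}{\partial s}\right)_{c,g,n} \left(\frac{\partial s}{\partial c_k}\right)_{g,n} + \left(\frac{\partial d_L}{\partial c_k}\right)_{g,n,s} \right] \frac{dc_k}{dn}.$$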

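The bracketed terms represent the respective partial derivatives
of $d_L$ with respect to $g_j$ and $c_k$ with s removed from those quantities
held constant. Since g(n) and c(n) are chosen for each value of n to
minimize the lower bound $d_L$, these partial derivatives must satisfy

$$\left(\frac{\partial d_L}{\partial g_j}\right)_{c,n} + \lambda_g = 0, \qquad 1 \le j \le J \tag{39}$$

$$\left(\frac{\partial d_L}{\partial c_k}\right)_{g,n} + \lambda_c = 0, \qquad 1 \le k \le K, \tag{40}$$

in which $\lambda_g$ and $\lambda_c$ are Lagrange multipliers for the constraints that
the components of g and of c each sum to one.

This presumes that, at least for sufficiently high n, both g and c have
only nonzero components. This is known to be true for c,"* which at
n = ∞ equals the channel input probabilities that use the channel to
capacity.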

The vector g, though, can at n = ∞ have a zero component. For this
case, if the approach of g(n) to g(∞) is from within the composition
space, that is, if the components of g(n < ∞) are nonzero, equation
39 is correct as written for all finite n. If, however, the approach of
g(n) to g(∞) is along the boundary of the composition space, that is,
having one or more components equal to zero for all n > N, then
equation 39 can be written, not for all 1 ≤ j ≤ J, but only for the J'
nonzero components. Over the region (N, ∞) the other J − J' zero components
obviously can be treated as constants and not included in the
differentiation process, and are thus excluded from the previous summations
on j. We shall not attempt to deal with the only remaining possibility,
which has g(n) approaching g(∞) such that it oscillates between vector
values with all nonzero components and values with some zero components,
since no example has been found exhibiting this behavior.

854 THE BELL SYSTEM TECHNICAL JOURNAL, JULY-AUGUST 1968

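We continue the derivation by substituting equations 39 and 40
into the derivative of $d_L$ to obtain

$$\frac{d d_L}{dn} = \left(\frac{\partial d_L}{\partial n}\right)_{c,g,s} + \left(\frac{\partial d_L}{\partial s}\right)_{c,g,n} \left(\frac{\partial s}{\partial n}\right)_{g,c} - \lambda_g \sum_{j} \frac{dg_j}{dn} - \lambda_c \sum_{k} \frac{dc_k}{dn}. \tag{41}$$

Finally, since both g and c are probability vectors, the last two sums
are equal to zero (this is true even when the first sum is only over the
J' nonzero components of g). It remains only to find the required
partial derivatives from equations 35 and 36. Writing $\sigma^2$ for Var(θ),
these are given by:

$$\left(\frac{\partial d_L}{\partial n}\right)_{c,g,s} = \frac{1}{2 n^2 s} \left( \frac{\gamma'' + \sigma^2}{s^2 \mu''} - 1 \right),$$

$$\left(\frac{\partial d_L}{\partial s}\right)_{c,g,n} = \mu'' + O\!\left(\frac{1}{n}\right),$$

$$\left(\frac{\partial s}{\partial n}\right)_{g,c} = -\frac{1}{2 n^2 s \mu''} \ln \frac{\gamma''}{s^2 \mu''},$$

whence substitution in equation 41 provides

$$\frac{d d_L}{dn} = -\frac{1}{n^2}\, \frac{1}{2|s|} \left[ \left( \frac{\gamma''}{s^2 \mu''} - 1 \right) - \ln \frac{\gamma''}{s^2 \mu''} + \frac{\sigma^2}{s^2 \mu''} \right] + o\!\left(\frac{1}{n^2}\right). \tag{42}$$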

At this point, the vectors g, c and the parameter s are still functions
of n chosen to satisfy the prescribed minimizations of equation 35
and the parametric equation 36. If, for large n, these functions are
written as

g(n) = g(∞) + Δg(n)
c(n) = c(∞) + Δc(n)
s(n) = s(∞) + Δs(n),

the delta terms can be extracted from the first term in equation 42.
Since each has limit zero for large n, they can, together with the $(1/n)^2$
coefficient, be absorbed into the terms of $o(1/n^2)$. Thus, in equation 42,
we can use for g, c, and s their final values: g(∞), c(∞), and s(∞).
Simple integration of equation 42 between n and infinity, and the
use of the known final value of $d_L(n)$, $d_L(\infty) = d_C$, provides the final
lower bound to distortion. We again point out that the derivation has
included the approximation that g(z) factors as in equation 15.
Theorem 6: A lower bound to the minimum attainable transmission
distortion in a system that includes the source $\mathcal{S}$ and the channel $\mathcal{C}$ is given by

$$d(\mathcal{S}) \ge d_C + \frac{1}{2 n |s|} \left[ \left( \frac{\gamma''}{s^2 \mu''} - 1 \right) - \ln \frac{\gamma''}{s^2 \mu''} + \frac{\sigma^2}{s^2 \mu''} \right] + o\!\left(\frac{1}{n}\right) \tag{43}$$

in which
C = capacity of $\mathcal{C}$
$d_C$ = the distortion at R = C on the rate-distortion curve for $\mathcal{S}$
$\mu(s) = \sum_{i} q_i \ln \sum_{j} g_j \exp s d_{ij}$
$\gamma(t) = \sum_{k} c_k \ln \sum_{l} f_l^{\,1+t}\, p_{kl}^{\,-t}$
q = p, the source output probabilities
g = the output probabilities on the test channel for $\mathcal{S}$ at $(d_C, C)$
c, f = the input and output probabilities on $\mathcal{C}$ when it is used to
capacity
t = −1
s satisfies $\mu - s\mu' = -C$.

The lower bound in equation 43 is seen to approach its limit algebraically
as a/n. Since (w − 1) is at least as large as ln w for any w
and $\sigma^2$ and $\mu''$ are variances, hence nonnegative, the coefficient a cannot
be negative. But it can in special cases equal zero. The conditions
for this are

$$\gamma'' = s^2 \mu'', \qquad \sigma^2 = 0,$$

conditions that are necessarily met when the source and channel are
perfectly matched; that is, when $d(\mathcal{S}) = d_C$ for all n.

They do not, however, constitute a sufficient condition for matching
since the low order correction terms in equation 43 could still be nonzero.
For the more common situations wherein a is nonzero, the form
of the lower bound suggests that the larger the value of a, the longer the
coding block length must be to obtain a tolerable level of distortion,
$d_C + \Delta$. In turn, the more complex the modulator and demodulator
must become. These relations all suggest the utility of the coefficient
a as a measure of mismatch between the source $\mathcal{S}$ and the channel $\mathcal{C}$;
the larger the value of a, the slower the approach of the lower bound
to its asymptote and the greater the mismatch between source and
channel. Section X gives several numerical examples illustrating different
types of mismatch.
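The nonnegativity of the coefficient and its vanishing for a matched pair can be checked numerically. The sketch below assumes the coefficient of 1/n in Theorem 6 has the form $a = [(w - 1) - \ln w + \sigma^2/(s^2\mu'')]/(2|s|)$ with $w = \gamma''/(s^2\mu'')$; this form, the function names, and the binary symmetric source and channel used as the matched example are assumptions of the illustration.

```python
import math

def mismatch_coefficient(gamma2, mu2, s, sigma2=0.0):
    """Assumed coefficient a = [(w - 1) - ln w + sigma2/(s^2 mu2)] / (2|s|),
    with w = gamma2 / (s^2 mu2)."""
    scale = s * s * mu2
    w = gamma2 / scale
    return ((w - 1.0) - math.log(w) + sigma2 / scale) / (2.0 * abs(s))

def bss_over_bsc(crossover):
    """Binary symmetric source (Hamming distortion, sigma2 = 0) used with a
    binary symmetric channel; for this pair d_C equals the crossover."""
    d_c = crossover
    s = math.log(d_c / (1 - d_c))        # s < 0 when d_C < 1/2
    mu2 = d_c * (1 - d_c)                # mu''(s) for this source
    # gamma''(-1): variance of ln(f/p) with uniform f on the channel output
    gamma2 = (crossover * (1 - crossover)
              * math.log((1 - crossover) / crossover) ** 2)
    return mismatch_coefficient(gamma2, mu2, s)

a_matched = bss_over_bsc(0.1)
# Scan w = gamma2 / (s^2 mu2) on both sides of the matching value w = 1.
a_scan = [mismatch_coefficient(g, 1.0, -1.0) for g in (0.25, 0.5, 1.0, 2.0, 4.0)]
```

The scan shows the coefficient rising on both sides of w = 1, and growing without bound as $\gamma''$ tends to zero.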


IX. THE SPECIAL CASE OF A NOISELESS CHANNEL 


As we have stated, Theorem 5 cannot be applied when $\mathcal{C}$ is noiseless
because factors equal to $\gamma''(-1)$ have been canceled within its derivation
and, for a noiseless channel, $\gamma''(-1)$ equals zero. We return to
the lower bound in equation 3 which is still valid. If the vector f is
chosen uniform over $Y^n$, we see from the definition of a noiseless channel
($L^n$ outputs) and the definition of information difference in Section IV
that I(x, y) is equal to ln (1/L) for the output $y_x$ that has $p(y_x \mid x) = 1$,
and is infinite for all other outputs. Since $f(y_x) = L^{-n}$, $e^{-nI(h)}$ is nonzero
only in $0 \le h \le L^{-n}$, where it is equal to $L^n$. Therefore, equation 3 can
be written as

$$d(w) \ge L^n \int_{0}^{L^{-n}} d(h)\, dh. \tag{44}$$


We remember that the distribution function G(d) is the "inverse"
function to d(h) and write

$$d(w) \ge L^n \int_{0}^{d(L^{-n})} [L^{-n} - G(d)]\, dd$$

which can be continued, with any $d_a < d(L^{-n})$, by

$$d(w) \ge L^n \int_{0}^{d_a} [L^{-n} - G(d)]\, dd.$$

Upon dividing the region of integration into two parts, $0 < d_b < d_a$,
and using the monotonicity of G(d), we have

$$d(w) \ge d_a - L^n d_b\, G(d_b) - L^n \int_{d_b}^{d_a} G(d)\, dd. \tag{45}$$

A further lower bound results if we use an upper bound to G(d) in
each of the last two terms. In particular, we use the asymptotic
bound in equation 20 which we denote here by

$$G(d) \le H(n, s) \exp n[\mu(s) - s\mu'(s)] \tag{46}$$

$$\mu'(s) = d.$$

We now set $d_a$ equal to $\mu'(s_o)$ with $s_o$ given by

$$H(n, s_o) \exp n[\mu(s_o) - s_o\mu'(s_o)] = L^{-n} = e^{-nC}. \tag{47}$$
The fact that $G(d_a) < L^{-n}$ guarantees the inequality $d_a < d(L^{-n})$
which we have already used. The second term in equation 45 can be
shown to be exponentially small in n whenever $d_b < d_a$; therefore, we
also impose this inequality. To bound the last term in the same equation
we use the well known Chernoff bound inequality:

$$\exp n[\mu(s) - s\mu'(s)] \le \exp n[\mu(s_o) - s_o d]$$

$$\mu'(s) = d$$

together with equations 46 and 47 to obtain

$$L^n \int_{d_b}^{d_a} G(d)\, dd \le D\, e^{n s_o \mu'(s_o)} \int_{d_b}^{d_a} e^{-n s_o d}\, dd$$

with

$$D = \max_{d_b \le d \le d_a} \frac{H(n, s)}{H(n, s_o)}.$$

The resulting bound for d(w), therefore, is

$$d(w) \ge \mu'(s_o) + \frac{D}{n s_o} \left[ 1 - \exp n s_o (\mu'(s_o) - d_b) \right] + o\!\left(\frac{1}{n}\right).$$

If $d_b$ is chosen in a way to approach $\mu'(s_o)$ with increasing n, this
bound becomes:

$$d(w) \ge \mu'(s_o) + \frac{1}{n s_o}\, [1 + o(1)] \tag{48}$$

in which $s_o$ satisfies equation 47, rewritten here as

$$\mu(s_o) - s_o\mu'(s_o) = -C - \frac{1}{n} \ln H(n, s_o) = -C + \frac{\ln n}{2n}\, [1 + o(1)]. \tag{49}$$


The remaining steps, averaging over the source space and minimizing
the resulting bound over all choices of g (we continue to use the approximation
in equation 15), are identical in procedure to those previously
used. We state only the result.


Theorem 7: The minimum attainable transmission distortion of the
source $\mathcal{S}$, when used with a noiseless channel of capacity C, satisfies

$$d(\mathcal{S}) \ge d_C + \frac{1}{2 |s_o|}\, \frac{\ln n}{n}\, [1 + o(1)] \tag{50}$$

in which $s_o$ satisfies

$$\mu(s_o, p) - s_o \mu'(s_o, p) = -C. \tag{51}$$
We see by comparing equations 43 and 50 that while the lower
bound to distortion with a noisy channel approaches its asymptote,
$d_C$, as 1/n, the lower bound to distortion with a noiseless channel
approaches $d_C$ only as (ln n)/n. These bounds are not inconsistent
since for a noiseless channel the variance $\gamma''$ is zero with the result
that the coefficient of 1/n in equation 43 is infinite. A similar limiting
statement is also true. If a noisy channel is made to approach a noiseless
one by reducing the noisy transition probabilities toward zero,
at the same time keeping the channel capacity constant by appropriately
reducing either the channel input alphabet size or the channel
dimensionality, the coefficient of the 1/n term increases and is unbounded.
These results therefore suggest that when there is a choice
between using a noiseless channel or a noisy one of equal capacity,
the noisy channel is always the better choice. And, inasmuch as we
are using the coefficient of the 1/n term to measure the source-channel
mismatch, the noiseless channel represents the worst possible
match to any source.
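The (ln n)/n behavior can be illustrated for a binary symmetric source with Hamming distortion, whose rate-distortion curve is $R = \ln 2 - H(d)$ in nats. The sketch below assumes, as a reading of equation 49, that the effective rate available at block length n is $C - (\ln n)/(2n)$, and compares the resulting excess distortion with the assumed leading term $(\ln n)/(2|s_o| n)$. The capacity value and all names are illustrative.

```python
import math

def binary_entropy(x):
    """Entropy of (x, 1 - x) in nats."""
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

def distortion_at_rate(rate, iterations=200):
    """Solve ln 2 - H(d) = rate for d in (0, 1/2) by bisection."""
    lo, hi = 1e-12, 0.5
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if math.log(2) - binary_entropy(mid) > rate:
            lo = mid      # rate too high means d is too small
        else:
            hi = mid
    return 0.5 * (lo + hi)

capacity = 0.3                            # nats per source letter (illustrative)
d_C = distortion_at_rate(capacity)
s_o = math.log(d_C / (1 - d_C))           # slope of R(d) at (d_C, C); negative

def excess_distortion(n):
    """d(n) - d_C when the effective rate is C - (ln n)/(2n)."""
    return distortion_at_rate(capacity - math.log(n) / (2 * n)) - d_C

def predicted_excess(n):
    """Assumed leading term (ln n) / (2 |s_o| n)."""
    return math.log(n) / (2 * abs(s_o) * n)

ratio = excess_distortion(10**6) / predicted_excess(10**6)
```

The ratio tends to one as n grows, since the rate deficit is converted to distortion through the slope $1/s_o$ of the rate-distortion curve.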


X. EXAMPLES 


In the first three examples, we illustrate different types of source-channel
mismatch and calculate the effect of each upon the coefficient
a in the lower bound of equation 43. Each of these examples tends to
strengthen the suggestion in the lower bound result that this coefficient
is a measure of source-channel mismatch since it increases
monotonically as the channel is perturbed away from the matching
channel.

Because the channel statistics influence only the first two terms of
a, we use in these examples a doubly uniform source for which the $\sigma^2$
term equals zero. To further isolate the relative matching properties
of the source-channel pairs, we keep constant the channel capacity
per source output, C, as the channel is varied. Thus the distortion
per source component has the same asymptote, $d_C$, for all source-channel
pairs and the only difference in the lower bound curves, at
least asymptotically, is in the coefficient a.


Example 1 


This example illustrates a dimensionality, or coding block length,
mismatch between a source and channel. We take for the source $\mathcal{S}$
the $m_s$'th product of a binary symmetric source, defined by p = (1/2, 1/2)
and $d_{11} = d_{22} = 0$, $d_{12} = d_{21} = 1$. For the channel $\mathcal{C}$ we take the $m_c$'th
product of a binary symmetric channel, each component $\mathcal{C}_i$ having a
crossover probability p. The channel capacity per source component
is $m_c/m_s$ times the capacity of $\mathcal{C}_i$ and is kept constant as $m_c/m_s$ is
varied by appropriately changing the crossover probabilities p.

Figure 6 shows the dependence of a upon $m_c/m_s$. When comparing
the two curves in this figure, notice that the ordinate has been normalized
by $d_C$. We know that for $m_c/m_s = 1$ the source and channel are precisely
matched and this is indicated in the figure by the value a = 0
at that point. Above this point a increases monotonically in $m_c/m_s$ and
can be shown to have the asymptotic form a ~ k($m_c/m_s$)*. Below
$m_c/m_s = 1$, a also becomes unbounded as $m_c/m_s$ approaches the ratio
that requires each component channel $\mathcal{C}_i$ be noiseless. This is not
inconsistent with the noiseless channel result (equation 50) which
indicated that the rate of approach of the distortion to $d_C$ was not as
a/n but as (ln n)/n.


Example 2 


Here we do not change the relative dimensionality, only the form
of the channel. The source is a binary symmetric source and the
channel a binary nonsymmetric channel of varying asymmetry. The
crossover probabilities are again changed in a way that does not vary
the capacity. We see in Fig. 7 that a is rather insensitive to small
perturbations from a binary symmetric channel and in most cases is
affected less by this type of mismatch than a dimensionality mismatch.
A similar result obtains if the source is also allowed to be
nonsymmetric.

Fig. 6. The mismatch between a binary symmetric source and a binary
symmetric channel of different dimensionality.

Fig. 7. The mismatch between a binary symmetric source and a binary
nonsymmetric channel.


Example 3 


For this example we use a binary symmetric source and a discrete
channel which models the m orthogonal signal modulator used in the
next example. The channel has m inputs and m outputs and has
from each input one transition of probability 1 − (m − 1)p and m − 1
transitions of probability p. The numbers m and p are varied together
in such a way that the capacity of the channel remains constant.
We see in Fig. 8 that the mismatch coefficient a is much higher
when the binary symmetric source is used with this channel than
when it is used with that product binary symmetric channel of
Example 1 which has available an input alphabet of equal size. The
comparison can be made on Figures 6 and 8 at points for which
$m_c/m_s = \log_2 m$.


Example 4 


In this, the last example, we include in the system a continuous
channel which is to be used by a discrete source with a discrete modulator.
Now, as the modulator changes, the discrete channel extracted
from the actual channel changes and both its capacity and its matching
characteristics change. It turns out that both properties are not
necessarily optimized for the same modulator structure and, therefore,
one must strike a compromise (influenced by the block length of
interest) between a modulator design that minimizes the asymptote
$d_C$ and one that maximizes the rate of approach to $d_C$.

To illustrate this we assume the channel to be a band-limited channel
with additive white gaussian noise in the allowed bandwidth.
During the interval (0, T), the discrete modulator is constrained to
transmit one of m orthogonal signals in each of B bauds and altogether
an energy no greater than E. To model the bandwidth constraint
the mB product is assumed constant, but m and B can otherwise
be varied to optimize the system. Thus the equivalent discrete
channel is the B'th product of the m input doubly uniform channel
of Example 3. The source to be transmitted is a binary symmetric
source with an output rate of $M_s$ digits every T seconds.

In Fig. 9 we show the minimum attainable distortion $d_C$ (determined
through the channel capacity) and the mismatch coefficient
a as a function of m. For the values shown in the figure, we see that
while $d_C$ is minimized at m = 15, the coefficient a is then quite large.
And, around m = 22, where a = 0, the minimum distortion $d_C$ is
higher than that which can be realized with a smaller m. The conclusion
from this is that the modulator should be designed with m =
15 (to maximize capacity and minimize $d_C$) only when one is willing
to use very long coding block lengths. For shorter block lengths, a
larger value of m, and a correspondingly smaller value of a, could result
in a smaller average distortion even with the larger value of $d_C$. For
this example a compromise design with m about 19 would probably
be best over a range of intermediate block lengths.

Fig. 8. The mismatch between a binary symmetric source and the m-orthogonal
signal channel.

Fig. 9. The influence of the modulator design in Example 4 on the minimum
attainable distortion and the mismatch coefficient, plotted against m
($E/N_0$ = 4630, $M_s$ = 2570 bits per T sec, mB = 12,000).

It is interesting to notice in this example that the coefficient a can
be zero even when the source and channel are not matched. This is
consistent with our previous interpretation of a = 0 as a necessary
but not sufficient condition for matching. We remember that the
coefficient a being zero does not imply that the lower bound in equation
43 is precisely $d_C$ for all n. There are several other terms of o(1/n)
in this equation that have not been specified which are not necessarily
zero when a = 0.
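The compromise described in Example 4 can be phrased as a comparison of two curves of the form $d_C + a/n$. The sketch below finds the block length at which two designs give the same leading-order bound. The numerical values are hypothetical and are not read from Fig. 9; they only illustrate a design with a lower asymptote but larger a against one with a higher asymptote but smaller a.

```python
def bound(d_c, a, n):
    """Leading terms of the lower bound: d_C + a / n."""
    return d_c + a / n

def crossover_block_length(design_1, design_2):
    """Block length at which two designs (d_C, a) give equal d_C + a / n."""
    (d1, a1), (d2, a2) = design_1, design_2
    return (a1 - a2) / (d2 - d1)

# Hypothetical (d_C, a) pairs, chosen only for illustration.
low_asymptote = (0.010, 2.0)    # smaller d_C, larger mismatch coefficient
well_matched = (0.012, 0.2)     # larger d_C, smaller mismatch coefficient
n_star = crossover_block_length(low_asymptote, well_matched)
```

Below `n_star` the better matched design gives the smaller leading-order bound; above it the design with the lower asymptote does.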


XI. THE UPPER BOUND 


Now let us present an upper bound to the minimum attainable
transmission distortion as a function of the coding block length. As
with the lower bound, the upper bound approaches the asymptote $d_C$,
but only as $[(\ln n)/n]^{1/2}$. The reason for the difference, we believe,
is that within the upper bound derivation the transmitting signal set
was restricted to contain at most $M = e^{nC}$ members, a restriction
that was not necessary to impose in the lower bound. We also present
an upper bound to the transmission distortion with a noiseless channel.
This bound does agree, asymptotically, with the corresponding
lower bound.


XII. THE RANDOM CODING ARGUMENT 


All of the upper bound derivations in this paper use random coding
arguments. That is, we do not explicitly find the encoder and decoder
which, when used with $\mathcal{S}$ and $\mathcal{C}$, provide the distortion in the upper
bound, but show that one pair does exist. More specifically, we construct
a set of encoder-decoder pairs with a probabilistic rule according
to which each system is selected to be used. This defines an ensemble
of transmission systems, each with its own distortion, corresponding
to all possible coding selections. What we calculate is a bound to the
average distortion of this ensemble. Clearly, this provides an upper
bound to the minimum distortion in the ensemble, hence to the minimum
attainable distortion in any system that includes $\mathcal{S}$ and $\mathcal{C}$.


12.1 The Construction of the Ensemble 


We denote the set of points on the rate-distortion curve for
$\mathcal{S}$ by $(d_R, R)$ and assume the capacity of $\mathcal{C}$ to
be $C$. We first choose any point $(d^*, R^*)$ on the rate-distortion
curve below $(d_C, C)$ and design the code in such a way that the
ensemble average distortion approaches $d^*$ with increasing block
length. We know this to be possible from Shannon's results. Moreover,
we expect, since the situation is somewhat analogous to a channel
coding problem with $R^* < C$, that the distortion can be made to
approach $d^*$ exponentially fast. The point $(d^*, R^*)$ is
subsequently varied to obtain the best result at any particular block
length of interest.

For any selection of $(d^*, R^*)$, we then choose the number of signal
points, $M = e^{nR}$, used to transmit $\mathcal{S}$. To attain a
transmission distortion level $d^*$, we certainly must have the number
of signal points large enough to represent the source to at least
within $d^*$, and this requires that $R$ be greater than $R^*$. We also
require that $R$ be less than $C$ so that in the limit as $n$ becomes
large, we are guaranteed correct decoding among the signal points at
the receiver. Therefore we have

$$R^* < R < C \tag{52}$$

and, for the corresponding values of distortion on the rate-distortion
curve,

$$d_{\max} \ge d^* > d_R > d_C. \tag{53}$$

The value of $R$ can also later be chosen to optimize the result.

An ensemble of codes of length $n$ is constructed for each selection of
$R$ and $R^*$. We use the probability distribution
$p(\mathbf{x}, \mathbf{z})$ to generate the ensemble by picking,
according to $p(\mathbf{x}, \mathbf{z})$, $M$ independent pairs
$(\mathbf{x}, \mathbf{z})$ from $X^n Z^n$. Thus we have a set of codes
containing all possible mappings of the integers 1 through $M$ into
pairs of $n$-letter words $(\mathbf{x}, \mathbf{z})$, or $(JK)^{nM}$
codes in total. (We continue to use here the notation defined in the
earlier part of the paper dealing with the lower bound.) Each of these
codes has the associated probability

$$\Pr(\text{code}) = \prod_{i=1}^{M} p(\mathbf{x}_i, \mathbf{z}_i).$$

Any probability function $p(\mathbf{x}, \mathbf{z})$ could be used to
obtain an upper bound, but we use a distribution that factors into
$p(\mathbf{x})g(\mathbf{z})$; therefore, in the ensemble, each set of
$M$ decoded words, $\theta_1$, is independent of each set of $M$
channel input words, $\theta_2$. Thus we can write

$$\Pr(\text{code}) = p(\theta_1, \theta_2) = p(\theta_1)p(\theta_2) = \prod_{i=1}^{M} p(\mathbf{x}_i) \prod_{i=1}^{M} g(\mathbf{z}_i).$$

Further, we use for $p(\mathbf{x})$ and $g(\mathbf{z})$ the product
forms

$$\prod_{t=1}^{n} p(x^{t}) \quad \text{and} \quad \prod_{t=1}^{n} g(z^{t})$$

in which the letter probability distribution $p(x)$ is that which
yields a mutual information $C$ on $\mathcal{C}$ and the letter
probability distribution $g(z)$ is that which gives the output
statistics on the test channel for $\mathcal{S}$ at the point
$(d^*, R^*)$ on the rate-distortion curve.

The encoding and decoding is done as follows: In every ensemble
member there is a list $\theta_1$ of allowed decoded words and a list
$\theta_2$ of usable channel input words. When a source output
$\mathbf{w}$ occurs, the encoder scans $\theta_1$ and chooses any
member $\mathbf{z}_k$ in this list for which

$$d(\mathbf{w}, \mathbf{z}_k) \le d^*. \tag{54}$$

If there are none, the encoder chooses any member at all on the list
$\theta_1$, say $\mathbf{z}_l$. Since the lists are chosen together,
there corresponds to $\mathbf{z}_k$ or $\mathbf{z}_l$ a particular
$\mathbf{x}$ in $\theta_2$, and this word is used to transmit
$\mathbf{w}$. The decoder uses a maximum likelihood decision rule to
decode $\mathbf{y}$ into a member of $\theta_2$, which is then
associated, through the pairings among the two lists, with a member
$\mathbf{z}$ in $\theta_1$. The resulting distortion, by definition,
is $d(\mathbf{w}, \mathbf{z})$.
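The threshold rule of equation 54 can be sketched in a few lines. The sketch below is only illustrative: the binary alphabets, the uniform letter distributions, the Hamming distortion, and the function names are all assumptions made for the example, not the distributions $p(x)$ and $g(z)$ prescribed in the text, and the channel and its maximum likelihood decoder are omitted.

```python
import random

def hamming_distortion(w, z):
    """Per-letter Hamming distortion between two equal-length words."""
    return sum(a != b for a, b in zip(w, z)) / len(w)

def build_lists(M, n, rng):
    """Draw M independent (z, x) pairs of n-letter binary words.

    Uniform binary letters are an illustrative assumption; the paper draws
    z from g(z) and x from p(x).
    """
    theta1 = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(M)]  # decoded words z
    theta2 = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(M)]  # channel inputs x
    return theta1, theta2

def threshold_encode(w, theta1, d_star):
    """Return (index, success) for the threshold rule of equation 54.

    The first list member within distortion d_star is chosen. If none
    exists, an arbitrary member (index 0 here) is used and success is False.
    """
    for k, z in enumerate(theta1):
        if hamming_distortion(w, z) <= d_star:
            return k, True
    return 0, False
```

The returned index selects the paired channel input word `theta2[k]`; encoder failure (the event whose probability is bounded in the next section) corresponds to `success` being `False`.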


12.2 The Ensemble Average Distortion 

Each member, $\theta$, of the ensemble is a complete transmission system
in itself, and has an average transmission distortion dependent upon
the codes, $\theta_1$ and $\theta_2$, that are used. This average
distortion, which is an average over all possible source and channel
events, is equal to

$$d(\theta) = d(\theta_1, \theta_2) = \sum_{W^n} p(\mathbf{w}) \sum_{Y^n} p(\mathbf{y} \mid \mathbf{x})\, d(\mathbf{w}, \mathbf{z}).$$

The ensemble average distortion is obtained by averaging
$d(\theta_1, \theta_2)$ over all choices of $\theta_1$ and $\theta_2$,
hence

$$\langle d(\theta) \rangle_{av} = \sum_{W^n} p(\mathbf{w}) \sum_{\theta_1} \Big[ \sum_{\theta_2} \sum_{Y^n} p(\mathbf{y} \mid \mathbf{x})\, d(\mathbf{w}, \mathbf{z})\, p(\theta_1) p(\theta_2) \Big]. \tag{55}$$

We next separate the events $\mathbf{w}$, $\theta_1$, $\theta_2$, and
$\mathbf{y}$ into two sets: (i) those quadruples for which either there
does not exist a $\mathbf{z}$ in $\theta_1$ satisfying equation 54 or
the received word $\mathbf{y}$ is decoded into a member of $\theta_2$
different from the transmitted word $\mathbf{x}(\mathbf{w})$, and (ii)
its complement. For quadruples in set one, the distortion
$d(\mathbf{w}, \mathbf{z})$ is surely upper bounded by $d_{\max}$, the
maximum entry in $\| d(w, z) \|$. For those in the second set, we use
equation 54 and the fact that the decoder returns us through
$\mathbf{x}(\mathbf{w})$ to $\mathbf{z}_k$, to upper bound the
distortion by $d^*$. Therefore, if the characteristic function $\Phi$
is used to indicate the quadruples in set one, we can upper bound the
ensemble average with

$$\langle d(\theta) \rangle_{av} \le \sum_{W^n} p(\mathbf{w}) \sum_{\theta_1} \sum_{\theta_2} \sum_{Y^n} p(\mathbf{y} \mid \mathbf{x})\, p(\theta_1) p(\theta_2) \big[ d^*(1 - \Phi) + d_{\max}\Phi \big] = d^* + (d_{\max} - d^*) \Pr(\Phi). \tag{56}$$

Finally, we use the union bound to upper bound $\Pr(\Phi)$ and the
ensemble average distortion, $\langle d(\theta) \rangle_{av}$, to upper
bound the minimum attainable transmission distortion, $d(\mathcal{S})$,
and obtain the result in the next theorem.


Theorem 8: The minimum attainable transmission distortion of the
source $\mathcal{S}$, when used with the channel $\mathcal{C}$, satisfies

$$d(\mathcal{S}) \le d^* + (d_{\max} - d^*) \big[ \Pr(\nexists\, \mathbf{z}_k \text{ in } \theta_1) + \Pr(\text{channel error}) \big] \tag{57}$$

in which $\nexists$ means "there does not exist," $d^*$ is any
distortion greater than $d_C$, and $R$ (a variable in the bracketed
terms) is any rate in the interval $R^* < R < C$. The bound is a
function of $n$ through the quantity in the brackets.

The last term in the brackets, the probability of error on the channel,
has been approximated by many people, but we will use Gallager's
bound

$$\Pr(e) \le e^{-nE(R)} \tag{58}$$

in which $E(R)$ is a positive monotonically increasing function of the
difference $C - R$. The next section is devoted to the evaluation of
the first term in the brackets, which is the probability that the
source word $\mathbf{w}$ and the list $\theta_1$ are such that equation
54 is not satisfied for any $\mathbf{z}$ in $\theta_1$.


XIII. THE PROBABILITY OF FAILURE AT THE ENCODER


We say that failure occurs at the encoder, for the source output
$\mathbf{w}$, when each of the $M$ allowed decoded words on list
$\theta_1$ is at a distortion $d(\mathbf{w}, \mathbf{z})$ from
$\mathbf{w}$ greater than $d^*$. Because each of the $M$ words in
$\theta_1$ is selected independently, we can write the total
probability of this failure as

$$\Pr(\nexists\, \mathbf{z}_k \text{ in } \theta_1) = \sum_{W^n} p(\mathbf{w}) \Pr(\nexists\, \mathbf{z}_k \text{ in } \theta_1 \mid \mathbf{w}) = \sum_{W^n} p(\mathbf{w}) \big[ 1 - \Pr(\mathbf{z} : d(\mathbf{w}, \mathbf{z}) \le d^* \mid \mathbf{w}) \big]^M. \tag{59}$$

The last probability is seen to equal the distribution function of the
distortion random variable described in Section 6.2 and defined by
equations 16 and 17. In these equations
$\mathbf{q} = q_1, q_2, \ldots, q_H$ is the composition vector of the
source word $\mathbf{w}$, and $D_{ir}$ is the letter distortion random
variable between the $r$'th appearance of the letter $w_i$ in
$\mathbf{w}$ and the corresponding letter in $\mathbf{z}$.

We again notice that the distribution function of
$d(\mathbf{w}, \mathbf{z})$ depends only upon the composition
$\mathbf{q}$ of $\mathbf{w}$. Thus we are able to perform the average
over $W^n$ in equation 59 as one over all possible compositions of
$\mathbf{w}$. All possible compositions can be represented as points in
the $(H - 1)$-dimensional hyperplane within the first quadrant of $R^H$
which intersects each axis $q_i$ at one. This hyperplane is called the
composition space $Q^H$. The probability of any composition point is
equal to the product of the number of different source words having
this composition and the probability of each; therefore, we have

$$P(\mathbf{q}) = N(\mathbf{q}) \prod_{i=1}^{H} p_i^{n q_i} = \frac{n!}{\prod_{i=1}^{H} (n q_i)!} \prod_{i=1}^{H} p_i^{n q_i}.$$
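As a small numerical illustration of the composition probability $P(\mathbf{q})$, the following sketch evaluates the multinomial expression above and enumerates all compositions for a short block length. The function names and the example source probabilities are hypothetical choices for the illustration; compositions are passed as integer letter counts $n q_i$.

```python
from itertools import product
from math import factorial, prod

def composition_probability(counts, p):
    """P(q) for the composition with letter counts n*q_i = counts[i].

    counts: non-negative integers summing to the block length n.
    p:      source letter probabilities p_i.
    """
    n = sum(counts)
    n_words = factorial(n)            # N(q) = n! / prod_i (n q_i)!
    for c in counts:
        n_words //= factorial(c)      # each division is exact
    return n_words * prod(pi ** c for pi, c in zip(p, counts))

def all_compositions(n, H):
    """All count vectors of length H that sum to n (brute force, small n only)."""
    return [c for c in product(range(n + 1), repeat=H) if sum(c) == n]
```

Summing `composition_probability` over `all_compositions(n, H)` returns one up to rounding, which is the statement that the compositions partition the set of source words $W^n$.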


Interpreting $P(\mathbf{q})$ as an impulse function over $Q^H$ we can
now write equation 59 as

$$\Pr(\nexists\, \mathbf{z}_k \text{ in } \theta_1) = \int \cdots \int_{Q^H} P(\mathbf{q}) \big[ 1 - G(d^* \mid \mathbf{q}) \big]^M \, d\mathbf{q}. \tag{60}$$

To continue the inequality in equation 57, we require a lower
bound to $G(d^* \mid \mathbf{q})$. For our present purpose, Fano's
lower bound is sufficient:

$$G(d^* \mid \mathbf{q}) \ge K(n, \mathbf{q}) \exp n\big[ \mu(s, \mathbf{q}) - s\mu'(s, \mathbf{q}) \big] = K(n, \mathbf{q}) \exp\big[ -nR(d^*, \mathbf{q}) \big] \tag{61}$$

in which

$$\mu'(s, \mathbf{q}) = d^* \tag{62}$$

$$0 < d^* \le E(d \mid \mathbf{q}) \tag{63}$$

$$\mu(s, \mathbf{q}) = \sum_{i=1}^{H} q_i \ln \sum_{j=1}^{J} g_j \exp(s d_{ij})$$

and $K(n, \mathbf{q})$ is a rather complex function of $\mathbf{q}$ and
$n$ that goes to zero algebraically with increasing $n$. Its precise
form is otherwise unimportant in the following derivation. (The bound
in equation 61 can still be used for points $\mathbf{q}$ that violate
equation 63 if one uses the value $s = 0$ rather than that which
satisfies equation 62.) We can therefore write

$$\Pr(\nexists\, \mathbf{z}_k \text{ in } \theta_1) \le \int \cdots \int_{Q^H} P(\mathbf{q}) \big[ 1 - K(n, \mathbf{q}) \exp(-nR(d^*, \mathbf{q})) \big]^{\exp nR} \, d\mathbf{q}. \tag{64}$$


The next step is to divide the composition space $Q^H$ into two
disjoint subspaces, $Q$ and $Q'$, that are defined by

$$Q = \{ \mathbf{q} : R(d^*, \mathbf{q}) < R - \delta \} \tag{65}$$

$$Q' = \{ \mathbf{q} : R(d^*, \mathbf{q}) \ge R - \delta \} \tag{66}$$

with $\delta$ any positive number satisfying $R^* < R - \delta$. The
idea behind this separation is illustrated in Fig. 10. The bracketed
term in the integrand of equation 64 has the form
$[1 - \exp(-nA)]^{\exp nB}$, which approaches zero with increasing $n$
when $A < B$, and one when $A > B$.

In the first region, which, except for the $\delta$, corresponds to the
set $Q$, we shall use the upper bound

$$[1 - \exp(-nA)]^{\exp nB} \le \exp[-\exp n(B - A)] \tag{67}$$

and in the second region, corresponding to $Q'$, the (poorer) bound

$$[1 - \exp(-nA)]^{\exp nB} \le 1. \tag{68}$$
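The bound in equation 67 follows from $\ln(1 - x) \le -x$, and both it and the limiting behavior of the bracketed term are easy to check numerically. The sketch below is a minimal check under the assumption $A > 0$; the function names are ours, and the term is evaluated in the log domain only to avoid rounding $1 - e^{-nA}$ to one.

```python
import math

def bracket_term(n, A, B):
    """[1 - exp(-n A)]^{exp(n B)}, evaluated as exp(exp(nB) * log(1 - exp(-nA)))."""
    return math.exp(math.exp(n * B) * math.log1p(-math.exp(-n * A)))

def double_exponential_bound(n, A, B):
    """Right side of equation 67: exp[-exp(n (B - A))]."""
    return math.exp(-math.exp(n * (B - A)))

if __name__ == "__main__":
    for n in (1, 10, 50, 200):
        print(n,
              bracket_term(n, 0.1, 0.2), double_exponential_bound(n, 0.1, 0.2),
              bracket_term(n, 0.2, 0.1), double_exponential_bound(n, 0.2, 0.1))
```

For $A < B$ both columns collapse to zero as a double exponential in $n$, while for $A > B$ they approach one, which is why the trivial bound of equation 68 loses little in the set $Q'$.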

Fig. 10. The division of the composition plane $Q^H$ into the set $Q$,
where $R(d^*, \mathbf{q}) < R - \delta$, and the set $Q'$, where
$R(d^*, \mathbf{q}) \ge R - \delta$.

The use of these bounds in equation 64 results in

$$\begin{aligned}
\Pr(\nexists\, \mathbf{z}_k \text{ in } \theta_1)
&\le \int \cdots \int_{Q} P(\mathbf{q}) \exp\{ -K(n, \mathbf{q}) \exp n[R - R(d^*, \mathbf{q})] \} \, d\mathbf{q} + \int \cdots \int_{Q'} P(\mathbf{q}) \, d\mathbf{q} \\
&\le \int \cdots \int_{Q} P(\mathbf{q}) \exp[ -K(n, \mathbf{q}) e^{n\delta} ] \, d\mathbf{q} + \Pr(Q') \\
&\le \exp[ -K(n) e^{n\delta} ] + \Pr(Q')
\end{aligned} \tag{69}$$

in which $K(n)$ denotes the minimum of $K(n, \mathbf{q})$ over $Q$. The
first term in this upper bound is a double exponential in $n$ which
will turn out to be unimportant. Thus it remains to evaluate
$\Pr(Q')$.

We shall use what we call the hypercube method to upper bound
$\Pr(Q')$. Although the resulting bound is not as tight as others that
could be derived (see, for example, the maximum probability point
method in Ref. 8), it has the advantage of being simpler both to derive
and to use and, in addition, does not seriously degrade the final bound
to transmission distortion. What is done is to enclose the set $Q'$ by
another set $Q_1'$ that has a relatively simple configuration, and to
upper bound $\Pr(Q')$ by $\Pr(Q_1')$.


We construct in $R^H$ a hypercube of dimension $2u$ centered at
$\mathbf{q} = \mathbf{p}$,

$$K^H = \{ \mathbf{q} : p_i - u \le q_i \le p_i + u \},$$

and intersect with it the composition space $Q^H$. The intersection
forms a "solid" $Q_1$,

$$Q_1 = Q^H \cap K^H,$$

which contains vertices of the form
$\mathbf{q}_v = q_{1v}, q_{2v}, \ldots, q_{Hv}$, with the components,
of course, summing to one. When $H$ is even, $q_{iv}$ equals either
$p_i + u$ or $p_i - u$, and when $H$ is odd, $q_{iv}$ has the same
values with the addition of one component equal to $p_i$. The vertices
of $Q_1$ are joined by straight lines.

At this point we use the fact that $Q$ is a convex set,* that is, for
$0 \le \lambda \le 1$, $\lambda \mathbf{q}_a + (1 - \lambda)\mathbf{q}_b$
is a member of $Q$ whenever both $\mathbf{q}_a$ and $\mathbf{q}_b$ are.
This property ensures that whenever the vertices of $Q_1$ are in the
set $Q$, the entire set $Q_1$ is in $Q$,

$$Q_1 \subset Q,$$

with the consequence, for the complement $Q_1'$ of $Q_1$ in $Q^H$, that

$$\Pr(Q') \le \Pr(Q_1'). \tag{70}$$


The remaining step is to bound the total probability of the set $Q_1'$.
Because this probability equals the probability that any of the
dependent events $q_i \notin [p_i - u, p_i + u]$ occurs, we can use the
union bound to upper bound $\Pr(Q_1')$ by the sum of the individual
probabilities. Thus

$$\Pr(Q_1') \le \sum_{i=1}^{H} \Pr[q_i < p_i - u] + \Pr[q_i > p_i + u].$$

These quantities can be further upper bounded by a simple application
of Chernov bounds. This has been done for us in Ref. 16, page 102,
where the result found is, in our notation,

$$\Pr(Q_1') \le \sum_{i=1}^{H} \big( e^{-nX_i} + e^{-nY_i} \big) \tag{71}$$

in which $X_i$ and $Y_i$ are the exponents for the lower and upper
deviation events of the $i$'th component, evaluated at the thresholds

$$d_i = p_i - u \ \text{ for } X_i, \qquad d_i = p_i + u \ \text{ for } Y_i.$$

In these bounds, the hypercube dimension $2u$ should be maximized,
to obtain the tightest bound, subject only to the constraint that all
vertices $\mathbf{q}_v$ be in region $Q$, that is, that they satisfy
equation 65.

The bound in equation 71 can be simplified still further by writing

$$\Pr(Q_1') \le 2H \exp\big[ -n \min_i (X_i, Y_i) \big] = K_1 \exp\big[ -nE_s(R) \big]. \tag{72}$$

Indeed, it can be shown (Ref. 8) that there are two, and not $2H$,
candidates for the minimizing quantity in the exponent.
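The union-plus-Chernov step can be illustrated numerically. The sketch below assumes the familiar form of the Chernov exponent for a binomial relative frequency, the binary divergence $D(d_i \,\|\, p_i)$; this is our assumption for the illustration and is not claimed to be the exact expression quoted from Ref. 16. The function names and the example probabilities are likewise hypothetical.

```python
import math

def chernov_exponent(d, p):
    """Binary divergence D(d || p) in nats (assumed form of X_i and Y_i)."""
    def term(a, b):
        return 0.0 if a <= 0.0 else a * math.log(a / b)
    return term(d, p) + term(1.0 - d, 1.0 - p)

def exact_outside_probability(n, p, u):
    """Exact Pr[q_i < p - u] + Pr[q_i > p + u] when n*q_i is Binomial(n, p)."""
    total = 0.0
    for k in range(n + 1):
        q = k / n
        if q < p - u or q > p + u:
            total += math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
    return total

def union_chernov_bound(n, probs, u):
    """Sum over letters of exp(-n X_i) + exp(-n Y_i), as in equation 71."""
    bound = 0.0
    for p in probs:
        if p - u > 0.0:   # otherwise the lower deviation event is impossible
            bound += math.exp(-n * chernov_exponent(p - u, p))
        if p + u < 1.0:   # otherwise the upper deviation event is impossible
            bound += math.exp(-n * chernov_exponent(p + u, p))
    return bound
```

Each component count is marginally binomial, so the per-letter exact tails can be compared directly with the corresponding pair of exponential terms; the sum of those pairs is then a valid, if loose, bound on the probability of leaving the hypercube.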


XIV. THE SET OF UPPER BOUNDS


Combining equations 57, 58, 69, and 72, we have the following result:


Theorem 9: The minimum attainable transmission distortion of the
source $\mathcal{S}$, when used with the channel $\mathcal{C}$, satisfies

$$d(\mathcal{S}) \le d^* + (d_{\max} - d^*) \big\{ \exp[-K(n)e^{n\delta}] + K_1 \exp[-nE_s(R)] + \exp[-nE(R)] \big\} \tag{73}$$

for any $d^*$ and $R$ that satisfy

$$d_{\max} \ge d^* > d_R > d_C \tag{74}$$

$$R^* < R < C. \tag{75}$$


The freedom provided by equations 74 and 75 can be used to generate
a set of upper bounds, corresponding to all possible choices of $d^*$
and $R$, the properties of which depend upon those of the two
exponential functions in equation 73. It has been shown elsewhere
(Ref. 8) that $E_s(R)$ is a positive monotone increasing function of the
difference $R - R^*$, that $E_s(R^*) = E_s'(R^*) = 0$, and that
$E_s''(R^*) \ne 0$. Comparing these with the corresponding properties
of the channel reliability function, namely that $E(R)$ is a positive
monotone increasing function of the difference $C - R$, that
$E(C) = E'(C) = 0$, and that $E''(C) \ne 0$, we see that the two
functions are quite similar. Typically, their curves would look like
those in Fig. 11.

With these curves, we can examine the behavior of the set of
bounds in Theorem 9. As shown in Fig. 12, when $d^*$ is chosen much
larger than $d_C$, the nonzero slope of the rate-distortion curve allows
a choice of $R$ that can make both the differences $C - R$ and
$R - R^*$ large. In turn, the exponents $E_s(R)$ and $E(R)$ in equation
73 are large and the exponential terms decay very rapidly with $n$. But
for this choice, the asymptote $d^*$ is much greater than the level
$d_C$, which we know can be approached.

Fig. 11. Typical behavior of $E_s(R)$ and $E(R)$ near their zero value.

Fig. 12. The rate-distortion curve for $\mathcal{S}$ illustrating the
relations among the parameters in Theorem 9.

On the other hand, if we choose $d^*$ only slightly greater than $d_C$,
we have an upper bound with an asymptote that is nearly $d_C$, but
now the differences $C - R$ and $R - R^*$, and therefore the exponents
$E_s(R)$ and $E(R)$, are much smaller and the rate of approach to the
asymptote $d^*$ is correspondingly slower. Thus, in the selection of
$d^*$ and $R$ there is a trade-off between a small asymptotic value and
a fast rate of approach. This is illustrated in Fig. 13 in which we
show a set of curves obtained from the upper bound expressions in
equation 73.

Fig. 13. The upper bound $d_U(n, d^*, R)$ in Theorem 9 with three
different values for $d^*$ and $R$.

The best compromise for any value of $n$ is given by the lower envelope
to the entire set of bounds in equation 73; therefore we have


Theorem 10: The minimum attainable transmission distortion of the
source $\mathcal{S}$, when used with the channel $\mathcal{C}$, satisfies

$$d(\mathcal{S}) \le \min_{d^*, R} d_U(n, d^*, R) = d_U(\mathcal{S}) \tag{76}$$

in which the function $d_U(n, d^*, R)$ is used to denote the right side
of equation 73.

In the next section we study the asymptotic behavior of the lower
envelope. At this point, though, we wish to include an important
conclusion that can be established from the set of upper bounds
in equation 73. Each individual bound indicates that, in a system
where the distortion level $d_C$ is attainable in the limit, if one
would tolerate a distortion $d^* = d_C + \Delta$, this level could be
approached exponentially fast as the coding block length is increased.

Actually, a much stronger statement is possible. Since the distortion
curve for $d^* = d_C + \tfrac{1}{2}\Delta$ approaches this level in the
limit, it must cross, at some finite $n$, the level $d_C + \Delta$.
Because both curves are for the same source and channel, this proves
that the distortion level $d_C + \Delta$ is not only approachable
exponentially fast, it is in fact attainable with a finite coding block
length. This is true for any $\Delta > 0$, no matter how small.


XV. THE ASYMPTOTIC BEHAVIOR OF THE UPPER BOUND


From the previous discussion it is clear that as $n$ increases, the
optimum value of $d^*$ must approach $d_C$ and therefore that the
exponents $E_s(R)$ and $E(R)$ must approach zero. For this reason we
use the Taylor series representations for these functions at $R^*$ and
$C$, respectively, in equations 73 and 76, and obtain

$$d_U(\mathcal{S}) \approx \min_{d^*, R} \Big\{ d^* + (d_{\max} - d^*) \big[ K_1 \exp(-n b_1 (R - R^*)^2) + \exp(-n b_2 (C - R)^2) \big] \Big\} \tag{77}$$

with $b_1 = \tfrac{1}{2}E_s''(R^*)$ and $b_2 = \tfrac{1}{2}E''(C)$. In
using the Taylor series for $E(R)$ and $E_s(R)$ we have dropped the
cubic terms since both $E'''(C)$ and $E_s'''(R^*)$ are finite and
$C - R$ and $R - R^*$ are $o(1)$. The double exponential term involving
$\delta$ is also dropped since it can be shown to contribute nothing
important in the asymptotic bound.

We next avoid the minimization on $R$ by choosing that value of $R$
which equates the two exponents:

$$b_1 (R - R^*)^2 = b_2 (C - R)^2. \tag{78}$$

While this selection of $R$ is nonoptimum for finite $n$, it can be
shown that it asymptotically approaches $R_{opt}$, and that it does not
affect the asymptotic behavior of the upper bound. This particular
choice of $R$ allows us to combine the two exponential terms in
equation 77. If we start with equation 78 and the obvious equality

$$(C - R) + (R - R^*) = C - R^*,$$

we can establish

$$(C - R) = \frac{\sqrt{b_1}}{\sqrt{b_1} + \sqrt{b_2}} \, (C - R^*) \tag{79}$$

$$(R - R^*) = \frac{\sqrt{b_2}}{\sqrt{b_1} + \sqrt{b_2}} \, (C - R^*), \tag{80}$$

which further allows us to write the two exponents in terms of the
common difference $C - R^*$.
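The choice of $R$ in equation 78 and the resulting common exponent can be verified with a few lines of arithmetic. In the sketch below the function names and the numerical values of $b_1$, $b_2$, $R^*$, and $C$ are arbitrary illustrative assumptions.

```python
import math

def equalizing_rate(R_star, C, b1, b2):
    """Rate R in (R*, C) that solves b1 (R - R*)^2 = b2 (C - R)^2 (equation 80)."""
    r1, r2 = math.sqrt(b1), math.sqrt(b2)
    return R_star + r2 / (r1 + r2) * (C - R_star)

def common_exponent_coefficient(b1, b2):
    """Coefficient of (C - R*)^2 shared by both exponents after equalization."""
    return b1 * b2 / (math.sqrt(b1) + math.sqrt(b2)) ** 2

if __name__ == "__main__":
    R_star, C, b1, b2 = 0.2, 1.0, 2.0, 0.5
    R = equalizing_rate(R_star, C, b1, b2)
    print(R, b1 * (R - R_star) ** 2, b2 * (C - R) ** 2,
          common_exponent_coefficient(b1, b2) * (C - R_star) ** 2)
```

The three printed exponent values agree, which is the content of equations 79 and 80: after equalization both exponents equal $b_1 b_2 (C - R^*)^2 / (\sqrt{b_1} + \sqrt{b_2})^2$.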

Next, we wish to express the difference $C - R^*$ in terms of the
difference $d_C - d^*$. Taylor's formula with remainder is again used:

$$R(d^*) = R(d_C) + R'(d_C)(d^* - d_C) + o(d^* - d_C)$$

or

$$C - R^* = -R'(d_C)(d^* - d_C) - o(d^* - d_C) = -s_c (d^* - d_C) - o(d^* - d_C). \tag{81}$$

In the last equation we have used the fact that the slope of the
rate-distortion curve at the point $(d_C, C)$ is equal to the value of
$s$ which satisfies $\mu(s) - s\mu'(s) = -C$.

Finally, we substitute equations 79, 80, and 81 into equation 77,
subtract $d_C$ from both sides of this last equation, and change the
minimizing variable to $d^* - d_C$ to obtain

$$d(\mathcal{S}) - d_C \le \min_x \big[ x + (A - x) K_2 \exp(-Bnx^2) \big] \tag{82}$$

in which $x = d^* - d_C$, $A = d_{\max} - d_C$,
$K_2 = K_1 + 1 = 2H + 1$, and
$B = b_1 b_2 s_c^2 / (\sqrt{b_1} + \sqrt{b_2})^2$.


We next find the asymptotic behavior of the lower envelope in
equation 82.

If $x$ is considered the parameter, each function of $n$ in the set
$f(x, n)$ starts at $f(x, 0) = x + (A - x)K_2$ and decreases
exponentially to $f(x, \infty) = x$. For any two parameter values,
$x_1$ and $x_2$ with $x_1 > x_2$, we have

$$f(x_1, 0) - f(x_2, 0) = (1 - K_2)(x_1 - x_2) = -2H \, [ f(x_1, \infty) - f(x_2, \infty) ].$$

Consequently, any two curves must cross as in Fig. 14.

It follows that the parameter $x_0(n)$, which identifies the minimum
of $f(x, n_1)$ at the value $n = n_1$, must change with $n$. Since this
parameter is the solution of

$$\frac{\partial}{\partial x} f(x, n) = 0,$$

we have

$$\exp(nBx_0^2) - K_2 = 2nK_2 B x_0 (A - x_0). \tag{83}$$


Figure 15 shows the required graphical solution, which clearly always
exists. The substitution of $x_0(n)$ in $f(x, n)$ specifies the single
function of $n$, $f[x_0(n), n]$, which is the desired lower envelope.
Unfortunately, an explicit solution is not possible for $x_0(n)$, nor
for $f[x_0(n), n]$, but we can obtain bounds to both that are adequate
for our purposes.
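Although equation 83 has no explicit solution, it is easily solved numerically. The following sketch locates $x_0(n)$ by bisection on the sign of $\partial f / \partial x$, which has the same sign as the left side minus the right side of equation 83. The parameter values ($A = 1$, $K_2 = 5$, $B = 1$) and the function names are illustrative assumptions only, and the routine assumes $K_2 > 1$ and $nBA^2 > \ln K_2$ so that the derivative changes sign on $[0, A]$.

```python
import math

def f(x, n, A, K2, B):
    """One member of the family in equation 82."""
    return x + (A - x) * K2 * math.exp(-B * n * x * x)

def f_prime(x, n, A, K2, B):
    """Derivative of f with respect to x; its zero solves equation 83."""
    return 1.0 - K2 * math.exp(-B * n * x * x) * (1.0 + 2.0 * B * n * x * (A - x))

def solve_x0(n, A, K2, B, iterations=200):
    """Bisection on [0, A] for the minimizing parameter x_0(n)."""
    lo, hi = 0.0, A
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f_prime(mid, n, A, K2, B) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def trial_x0(n, B, a=0.5):
    """Trial functional form [a (ln n) / (B n)]^(1/2)."""
    return math.sqrt(a * math.log(n) / (B * n))

if __name__ == "__main__":
    A, K2, B = 1.0, 5.0, 1.0
    for n in (10**3, 10**5, 10**7):
        x0 = solve_x0(n, A, K2, B)
        print(n, x0, trial_x0(n, B), f(x0, n, A, K2, B))
```

Because the left side of equation 83 is convex in $x$ and the right side is concave, the two cross exactly once on $[0, A]$, so bisection is sufficient. For these parameter values the numerical $x_0(n)$ lies above the trial value with $a = \tfrac{1}{2}$, and the two approach each other only slowly as $n$ grows.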

Fig. 14. Two members of the family of curves
$f(x, n) = x + (A - x)K_2 \exp(-Bnx^2)$.


From the graphical solution in Fig. 15, we see that any conjectured
solution for $x_0$ must be too large if, in equation 83, the left side
exceeds the right and too small if the reverse is true. This criterion
could also be used on a trial functional solution for $x_0(n)$. Now, if
the left side of equation 83 is functionally stronger in $n$ than the
right, we know that our trial solution is too strong in $n$. Again the
reverse is also true.

After several guesses we are led to the trial functional solution
$x_0(n) = [a(\ln n)/Bn]^{1/2}$ with which the right side of equation 83
is greater than the left for $a < \tfrac{1}{2}$, and the reverse is
true for $a > \tfrac{1}{2}$.






Fig. 15. The graphical solution of equation 83.
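
This determines the highest order term of $x_0(n)$ and we can write,
with $\epsilon$ an arbitrarily small positive number,

$$\Big[ \frac{1}{2} \frac{\ln n}{Bn} \Big]^{1/2} + o\Big( \Big[ \frac{\ln n}{n} \Big]^{1/2} \Big) \le x_0(n) \le \Big[ \Big( \frac{1}{2} + \epsilon \Big) \frac{\ln n}{Bn} \Big]^{1/2} + o\Big( \Big[ \frac{\ln n}{n} \Big]^{1/2} \Big).$$

It follows that

$$f[x_0(n), n] \ge \Big( \frac{1}{2B} \Big)^{1/2} \Big( \frac{\ln n}{n} \Big)^{1/2} + o\Big( \Big[ \frac{\ln n}{n} \Big]^{1/2} \Big)$$

and, since the lower envelope is smaller than any individual $f(x, n)$,
that

$$f[x_0(n), n] \le f\Big[ \Big( \frac{1}{2} \frac{\ln n}{Bn} \Big)^{1/2}, n \Big] = \Big( \frac{1}{2B} \Big)^{1/2} \Big( \frac{\ln n}{n} \Big)^{1/2} + o\Big( \Big[ \frac{\ln n}{n} \Big]^{1/2} \Big). \tag{84}$$

Although only an upper bound to $f(x, n)$ is required, both upper
and lower bounds were found to show that the method used to obtain
the desired lower envelope provides asymptotically tight results.
Continuing the inequality in equation 82 by that in equation 84
provides our final upper bound to transmission distortion.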


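Theorem 11: The minimum attainable transmission distortion of the
source $\mathcal{S}$, when used with the channel $\mathcal{C}$, is upper
bounded by

$$d(\mathcal{S}) \le d_C + b \Big( \frac{\ln n}{n} \Big)^{1/2} [1 + o(1)] \tag{85}$$

in which

$$b = \Big( \frac{1}{2B} \Big)^{1/2} = \frac{1}{(2)^{1/2} |s_c|} \Big[ \frac{1}{\sqrt{b_1}} + \frac{1}{\sqrt{b_2}} \Big]$$

$$b_1 = \tfrac{1}{2} E_s''(R^* = C)$$

$$b_2 = \tfrac{1}{2} E''(C).$$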


For a fixed source $\mathcal{S}$, we see from this theorem that the
coefficient $b$ is smallest when $\mathcal{S}$ is used with that channel
(among those of equal capacity) for which the constant $b_2$ is
largest. In the same way, the coefficient $b$ is seen to be a
decreasing function of $b_1$ when the channel is fixed. Since the
constant $b_2$ is independent of the source and $b_1$ independent of
the channel, our upper bound does not provide an indicator of matching
between the source and channel as we obtained in the lower bound. This
was actually expected since here we were forced to separate the source
and channel with an interface containing at most $e^{nC}$ points.

The coefficient $b_1$, though, has an interesting significance. It is
equal to one-half the derivative $E_s''(R^* = C)$ which can be thought
to indicate how fast the boundary of $Q'$ initially moves away from
$\mathbf{p}$ with increasing $R$. In turn, this indicates, in a
reciprocal manner, the necessary rate of change of the rate required to
handle source words with compositions just around $\mathbf{p}$, which
are just less than typical. Thus, we can think of the coefficient $b_1$
as a type of "stretch factor" for the source.

When the result in equation 85 is compared with the lower bound to
distortion, we see that the $[(\ln n)/n]^{1/2}$ rate of approach to
$d_C$ is slower than the $1/n$ rate of approach of the lower bound.
Mathematically, at least, the reason for the upper bound decreasing
more slowly than $(1/n)^{1/2}$ is that, for small arguments, the lowest
order term in the two exponents $E(R)$ and $E_s(R)$ is quadratic. Their
form for large $n$, $\exp[-n(\Delta R)^2]$, shows that values of
$\Delta R$ larger than $(1/n)^{1/2}$ are required to have these terms
go to zero with increasing $n$. Because the slope of the
rate-distortion curve is nonzero, the corresponding values of
distortion difference $(\Delta d)$ must also be larger than
$(1/n)^{1/2}$.

There is reason to think that this type of exponential term, and the
consequential $[(\ln n)/n]^{1/2}$ rate of approach to $d_C$, is present
in the upper bound because we have used threshold devices in the
transmission system. One at the encoder leads to the first exponential
term in equation 73 (we again disregard the double exponential term).
It uses the rule in equation 54 to choose, for each source word
$\mathbf{w}$, any decoder word $\mathbf{z}$ in list $\theta_1$ at a
distortion less than $d^*$. When list $\theta_1$ is lacking such an
entry, any $\mathbf{z}$ at all on the list is chosen which, since the
members of $\theta_1$ are chosen independently, is then independent of
$\mathbf{w}$. The resulting distortion in this circumstance is usually
much greater than $d^*$. In the next section we compare the performance
of this encoder with another that does not use such a threshold and
show that the source encoding alone need only contribute to a rate of
approach to $d_C$ equal to $(\ln n)/n$.

A second threshold operation in our system is at the channel decoder,
but it is really dependent upon the coding of the entire system. It
leads to the second exponential term in equation 73. To isolate its
effect on the system performance, we assume that failure has not
occurred at the encoder, that is, there does exist a $\mathbf{z}$ on
$\theta_1$ with $d(\mathbf{w}, \mathbf{z}) \le d^*$. Now if the channel
decoder makes no error, we are assured that the resulting distortion is
less than $d^*$. However, if an error is made, the believed channel
input word $\mathbf{x}_2$ is different from the actual word
$\mathbf{x}_1$; therefore the decoded word $\mathbf{z}_2$ is different
from $\mathbf{z}_1$. Moreover, since the lists $\theta_1$ and
$\theta_2$ are chosen independently, $\mathbf{z}_2$ and $\mathbf{z}_1$
are statistically independent. It follows that $\mathbf{z}_2$ and
$\mathbf{w}$ are also statistically independent, and in consequence
that the distortion $d(\mathbf{w}, \mathbf{z}_2)$ is usually much
greater than $d^*$.




It is this threshold which, it is believed, cannot be eliminated when
the signal space is constrained to contain at most $M = e^{nC}$ points,
even if the lists $\theta_1$ and $\theta_2$ are chosen dependently. A
heuristic argument in Ref. 8 suggests that with such a constrained
signal set, the transmission distortion can approach $d_C$ no more
rapidly than as $n^{-1/2}$. This, of course, is a slower rate of
approach to $d_C$ than the $a/n$ rate of approach of the corresponding
lower bound to distortion that was derived using a signal set not
constrained in size.


XVI. AN IMPROVED UPPER BOUND FOR NOISELESS CHANNELS 


For the special case of a noiseless channel, the previously derived
upper bound can be improved. Since such a channel contains $e^C$
noiseless transitions, or "direct" paths, transmission of the encoder
output is trivial and the communication problem is only one of source
representation. For this representation we are allowed to choose, from
an $e^C$-letter representation alphabet, one representation letter for
every source output letter. Just as one is allowed $n$ uses of the
channel to transmit an $n$-letter source output, one is allowed an
$n$-letter representation word to approximate an $n$-letter source
word.

We first state that if the threshold source encoder defined by
equation 54 is used in the ensemble of representation codes $\theta_1$
of Section XII, the ensemble average representation error is very
similar to the ensemble average transmission error derived in the
previous sections. The only difference in the derivation is that the
Pr(channel error) term is no longer present in equation 57, nor in any
succeeding equation, with the only result being that $b_2 = \infty$ in
equation 85.

We note here that this particular result is valid only for sources
that are not doubly-uniform, that is, having a uniform probability
distribution and a distortion matrix in which all rows are
permutations of one row vector and all columns are permutations of one
column vector. The reason for this exclusion is that for
doubly-uniform sources the exponential term in equation 73 involving
$E_s(R)$ also vanishes, and the double exponential term involving
$\delta$, previously dropped as insignificant, now remains as the only
term. It is instructive to delay further evaluation of the bound in
this case until after the following upper bound to representation
distortion is derived.


16.1 Optimum Source Encoder 


We now derive an upper bound to the source representation error
when an optimum source encoder is used in place of the threshold
encoder of the previous section. The resulting upper bound will be
seen to approach the asymptote, $d_C$, as $(\ln n)/n$. This represents
an improvement upon the best previously known upper bound to source
representation distortion, which approached $d_C$ essentially as
$n^{-1/2}$.

The coding ensemble used here is very similar to the set of codes,
$\theta_1$, used in Section XII. But now the size of the set, $M$, is
set equal to $e^{nC}$ for all $n$, rather than have it approach this
size with increasing $n$. And the probability with which each ensemble
member is used,

$$\Pr(\text{code}) = p(\theta_1) = \prod_{i=1}^{M} g(\mathbf{z}_i),$$

is now governed by that probability distribution $g(z)$ equal to the
output probability distribution of the test channel at the point
$(d_C, C)$ on the rate-distortion curve for $\mathcal{S}$. Within each
ensemble member the encoder chooses, for any occurring source word
$\mathbf{w}$, that member $\mathbf{z}$ on $\theta_1$ for which
$d(\mathbf{w}, \mathbf{z})$ is minimum. Therefore, for each ensemble
member the average distortion over all possible source events is

$$d(\theta_1) = \sum_{W^n} p(\mathbf{w}) \big[ \min_{\mathbf{z}_i \in \theta_1} d(\mathbf{w}, \mathbf{z}_i) \big]. \tag{86}$$

The ensemble average distortion is given by

$$\langle d(\theta) \rangle_{av} = \sum_{W^n} p(\mathbf{w}) \sum_{\theta_1} p(\theta_1) \big[ \min_{\mathbf{z}_i \in \theta_1} d(\mathbf{w}, \mathbf{z}_i) \big]. \tag{87}$$

The set of quantities $d(\mathbf{w}, \mathbf{z}_i)$ in equation 87
could be thought of as a set of $M$ independent and identically
distributed random variables, each conditioned on $\mathbf{w}$ and
governed by the word probability distribution $g(\mathbf{z})$. The
minimum of this set, $d_{\min}(\mathbf{w})$, is then also a random
variable, governed by the code probability distribution $p(\theta_1)$.
The inner sum in equation 87 is, therefore, the expected value of
$d_{\min}(\mathbf{w})$ and we can write


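$$\langle d(\theta) \rangle_{av} = \sum_{W^n} p(\mathbf{w}) \int_0^{d_{\max}} d \; dF_{d_{\min} \mid \mathbf{w}}(d \mid \mathbf{w})$$

which, upon integration by parts, becomes

$$\langle d(\theta) \rangle_{av} = \sum_{W^n} p(\mathbf{w}) \int_0^{d_{\max}} \big[ 1 - F_{d_{\min} \mid \mathbf{w}}(d \mid \mathbf{w}) \big] \, dd. \tag{88}$$

The conditional distortion random variables
$d(\mathbf{w}, \mathbf{z}_i)$ are the same distortion variables used in
Section XIII. Since they depend only upon the composition of
$\mathbf{w}$, we can again perform the summation in equation 88 by
integration over the composition space, thus

$$\langle d(\theta) \rangle_{av} = \int \cdots \int_{Q^H} P(\mathbf{q}) \, d\mathbf{q} \int_0^{d_{\max}} \big[ 1 - F_{d_{\min} \mid \mathbf{q}}(d \mid \mathbf{q}) \big] \, dd \tag{89}$$

$$= \int \cdots \int_{Q^H} P(\mathbf{q}) \, d\mathbf{q} \, \langle d_{\min}(\mathbf{q}) \rangle_{av}. \tag{90}$$

The inner integrand in equation 89 is the probability that all $M$
points on $\theta_1$ have a distortion $d(\mathbf{w}, \mathbf{z})$ from
$\mathbf{w}$ greater than $d$. Using the independence property of the
members of $\theta_1$, we can write this probability as

$$1 - F_{d_{\min} \mid \mathbf{q}}(d \mid \mathbf{q}) = [1 - G(d \mid \mathbf{q})]^M. \tag{91}$$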
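Equations 88 and 91 together say that the expected minimum of $M$ independent distortions is the integral of $[1 - G(d)]^M$. A minimal numerical sketch of this identity follows; the midpoint-rule integrator, the function name, and the uniform distribution used as a stand-in for $G(d \mid \mathbf{q})$ are assumptions chosen only because the answer, $1/(M + 1)$, is known in closed form.

```python
def expected_minimum(cdf, d_max, M, steps=20000):
    """Midpoint-rule value of the integral of [1 - G(d)]^M over [0, d_max].

    cdf:   distribution function G of a single non-negative distortion.
    d_max: largest possible distortion.
    M:     number of independent, identically distributed candidates.
    """
    h = d_max / steps
    return sum((1.0 - cdf((i + 0.5) * h)) ** M for i in range(steps)) * h

def uniform_cdf(d):
    """Illustrative stand-in for G(d | q): uniform distortion on [0, 1]."""
    return min(max(d, 0.0), 1.0)

if __name__ == "__main__":
    for M in (1, 4, 16, 64):
        print(M, expected_minimum(uniform_cdf, 1.0, M), 1.0 / (M + 1))
```

As $M$ grows the integrand sharpens toward a step, which is the behavior exploited in the remainder of this section with $M = e^{nC}$.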


It can be seen from equation 16 that the variance of the variable $d$
is proportional to $1/n$ for every $\mathbf{q}$. Therefore the function
$[1 - G(d \mid \mathbf{q})]$, which for every $n$ decreases
monotonically from one to zero, approaches, with increasing $n$, a
negative step at the value of distortion $d = E(d \mid \mathbf{q})$.

The same is also true of $[1 - G(d \mid \mathbf{q})]^M$ which
approaches a negative step at some lower value of distortion,
$d_C(\mathbf{q})$. This can be established using the following
asymptotic upper and lower bounds to the distribution function
$G(d \mid \mathbf{q})$ which are from Shannon and Gallager:

$$h(n, \mathbf{q}) \exp[-nR(d, \mathbf{q})] \le G(d \mid \mathbf{q}) \le H(n, \mathbf{q}) \exp[-nR(d, \mathbf{q})] \tag{92}$$

with

$$R(d, \mathbf{q}) = -\big[ \mu(s, \mathbf{q}) - s\mu'(s, \mathbf{q}) \big] \tag{93}$$

$$0 < \mu'(s, \mathbf{q}) = d \le E(d \mid \mathbf{q})$$

and in which $h(n, \mathbf{q})$ and $H(n, \mathbf{q})$ are
algebraically small functions of $n$. Therefore, within the range
$0 < d < E(d \mid \mathbf{q})$, the function in equation 91 can be
bounded, with $M = e^{nC}$, by

$$\big[ 1 - H e^{-nR(d, \mathbf{q})} \big]^{\exp nC} \le [1 - G(d \mid \mathbf{q})]^M \le \big[ 1 - h e^{-nR(d, \mathbf{q})} \big]^{\exp nC} \tag{94}$$

which proves that $[1 - G(d \mid \mathbf{q})]^M$ must approach one when
$R(d, \mathbf{q}) > C$ and zero when $R(d, \mathbf{q}) < C$. That the
function $R(d, \mathbf{q})$ is monotone decreasing in $d$ within
$0 < d < E(d \mid \mathbf{q})$ now establishes the stated limiting step
function form of $[1 - G(d \mid \mathbf{q})]^M$ with $d_C(\mathbf{q})$
equal to the distortion value for which

$$R[d_C(\mathbf{q}), \mathbf{q}] = C. \tag{95}$$
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The region of integration in equation 89 is thus conveniently divided 
into two parts: one over [0, d¢(q) + A] in which the integrand is upper- 
bounded by unity, and the other [de(q) + A, dinax] in which the integrand 
is upper-bounded by its value at the lower limit. The result is — 


    \langle d_{min}(q) \rangle_{av} \le d_C(q) + \Delta + [d_{max} - d_C(q) - \Delta] [1 - G(d_C(q) + \Delta \mid q)]^{M}        (96)


which, with the use of the lower bound in equation 92, can be continued
by


    \langle d_{min}(q) \rangle_{av} \le d_C(q) + \Delta + [d_{max} - d_C(q) - \Delta] \{1 - h \exp[-nR(d_C(q) + \Delta, q)]\}^{e^{nC}}.

Equation 67 allows the further continuation of this bound by:

    \langle d_{min}(q) \rangle_{av} \le d_C(q) + \Delta + [d_{max} - d_C(q) - \Delta] \exp(-h \exp\{n[C - R(d_C(q) + \Delta, q)]\}).        (97)


Again the monotone decreasing property of R(d, q) in d provides that
the quantity C − R(d_C(q) + Δ, q) is positive when Δ is positive and,
therefore, that the last term in equation (97) is a decreasing double
exponential in n.

Equation 97 actually provides, for each q, a set of upper bounds to
⟨d_min(q)⟩_av very similar to the family of curves studied in Section XV.
In the choice of the parameter Δ there is once again a trade-off between
a small asymptote, d_C(q) + Δ, and a fast rate of approach. It should,
in general, be chosen to optimize the bound at each n. Since we want an
upper bound to ⟨d_min(q)⟩_av that approaches d_C(q) with increasing n,
the optimizing parameter Δ_0(n) clearly must approach zero as n
increases. But Δ_0(n) must approach zero in a way that also allows the
last term of equation 97 to vanish.

Since an asymptotic bound is our goal, we extract the essential
behavior of this term for small Δ by forming a Taylor series of R(d, q)
at d = d_C(q):

    C - R(d_C(q) + \Delta, q) = -\Delta R'(d_C(q), q) + o(\Delta)
                              = -s\Delta + o(\Delta).


In this expression s is the parameter value in equation 93 when d equals
d_C(q). Thus the lower envelope to the set of bounds in equation 97
can be written, for the purpose of an asymptotic bound, as


    \langle d_{min}(q) \rangle_{av} \le \min_{\Delta} \{ d_C(q) + \Delta + [d_{max} - d_C(q) - \Delta] \exp(-h e^{-sn\Delta}) \}.

The minimization is found using the same method used in Section XV.
In this process, it is important to notice that Shannon's coefficient
h(n, q) in equation 92 is proportional to n^{-1/2}. The result is that the
optimizing parameter satisfies

    \frac{1}{2} \frac{\ln n}{-sn} [1 + o(1)] \le \Delta_0(n) \le \left(\frac{1}{2} + \epsilon\right) \frac{\ln n}{-sn} [1 + o(1)]

and that ⟨d_min(q)⟩_av satisfies

    \langle d_{min}(q) \rangle_{av} \le d_C(q) + \left(\frac{1}{2} + \epsilon\right) \frac{\ln n}{-sn} [1 + o(1)].        (98)
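
The trade-off in the choice of Δ can be explored numerically. The sketch below minimizes Δ + (d_max − d_C − Δ) exp(−h e^{−snΔ}) over a grid, assuming h = n^{-1/2} exactly and s < 0. The values n = 10^4, s = −1, and d_max − d_C = 1, as well as the function names, are hypothetical choices made only for illustration.

```python
import math

def bound(delta, n, s, gap):
    """Delta + (gap - Delta) * exp(-h * exp(-s*n*Delta)), with h = n**-0.5.

    gap stands for d_max - d_C(q); the common offset d_C(q) is omitted.
    """
    h = n ** -0.5
    return delta + (gap - delta) * math.exp(-h * math.exp(-s * n * delta))

def best_delta(n, s, gap, points=20001):
    """Grid search for the minimizing Delta on [0, 3*ln(n)/(-s*n)]."""
    upper = 3.0 * math.log(n) / (-s * n)
    grid = [upper * k / (points - 1) for k in range(points)]
    return min(grid, key=lambda delta: bound(delta, n, s, gap))

n, s, gap = 10_000, -1.0, 1.0
delta_opt = best_delta(n, s, gap)
ratio = delta_opt * (-s * n) / math.log(n)   # to be compared with 1/2 + epsilon
```

At this finite block length the ratio sits somewhat above 1/2, which is in keeping with an optimizing parameter of the form (1/2 + ε)(ln n)/(−sn) for a small positive ε.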


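Returning to equation 90, the ensemble average representation error
therefore can be upper bounded by


    \langle d(\mathcal{S}) \rangle_{av} \le \int \cdots \int P(q) \left[ d_C(q) + \left(\frac{1}{2} + \epsilon\right) \frac{\ln n}{-sn} \right] dq.        (99)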


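The above integral is evaluated in the same way that similar averages
were found for the lower bound. The bracketed quantity is expanded
in a Taylor series about q = p and is truncated after three terms with
a Lagrange remainder term. Upon integration of this expansion we find


    \langle d(\mathcal{S}) \rangle_{av} \le d_C(p) + \left(\frac{1}{2} + \epsilon\right) \frac{\ln n}{-s_0 n}
        + \sum_i \frac{\partial}{\partial q_i} \left[ d_C(q) + \left(\frac{1}{2} + \epsilon\right) \frac{\ln n}{-sn} \right]_{q=p} E(q_i - p_i)
        + \frac{1}{2} \sum_{i,j} \frac{\partial^2}{\partial q_i \partial q_j} \left[ d_C(q) + \left(\frac{1}{2} + \epsilon\right) \frac{\ln n}{-sn} \right]_{q=\hat{q}} E[(q_i - p_i)(q_j - p_j)]        (100)


with s_0 = s(p) and \hat{q} ∈ Q^H.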
Using the following expected values in equation (100),


    E(q_i - p_i) = 0

    E[(q_i - p_i)(q_j - p_j)] = \frac{1}{n} (p_i \delta_{ij} - p_i p_j),


we have the following upper bound to the ensemble average distortion 
and, therefore, to the minimum attainable representation error. 


Theorem 12: The minimum attainable transmission distortion (representation
distortion) of the source 𝒮, when used with a noiseless channel
of capacity C, is upper bounded by


    d(\mathcal{S}) \le d_C + \left(\frac{1}{2} + \epsilon\right) \frac{\ln n}{-s_0 n} [1 + o(1)]        (101)


in which s_0 satisfies

    \mu(s_0, p) - s_0 \mu'(s_0, p) = -C.


Except for the arbitrarily small positive «, the bound in equation 101 
agrees precisely with the asymptotic lower bound that we found earlier 
in this paper. 
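
The expected values of the composition q used in deriving Theorem 12 are the mean and covariance of a multinomial composition. As a sanity check, the sketch below computes them exactly by enumerating all words of a short block. The source probabilities p and the block length n = 4 are hypothetical and chosen only to keep the enumeration small.

```python
import itertools
import math

def composition_moments(p, n):
    """Exact mean and covariance of the composition q of an n-letter word
    drawn from an independent-letter source with letter probabilities p."""
    k = len(p)
    mean = [0.0] * k
    second = [[0.0] * k for _ in range(k)]
    for word in itertools.product(range(k), repeat=n):
        prob = math.prod(p[letter] for letter in word)
        q = [word.count(letter) / n for letter in range(k)]
        for i in range(k):
            mean[i] += prob * q[i]
            for j in range(k):
                second[i][j] += prob * q[i] * q[j]
    cov = [[second[i][j] - mean[i] * mean[j] for j in range(k)] for i in range(k)]
    return mean, cov

p = [0.5, 0.3, 0.2]
n = 4
mean, cov = composition_moments(p, n)
# Closed form quoted in the text: (1/n) * (p_i * delta_ij - p_i * p_j)
expected = [[(p[i] * (i == j) - p[i] * p[j]) / n for j in range(3)] for i in range(3)]
```

Because the covariance scales as 1/n, the second-order term of the Taylor expansion contributes only O(1/n), which is dominated by the (ln n)/n term of the bound.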

We see by comparing equation 85 (with b, = © for the noiseless 
channel) and equation 101 that the replacement of the threshold source 
encoder with an optimum encoder increases the rate of approach to
the asymptote from [(ln n)/n]^{1/2} to (ln n)/n. To obtain some feeling
for the reason for this improvement, we might think of the optimum 
encoder as a threshold encoder, but with a threshold that varies de- 
pending on the particular source output. Indeed, we used this step 
within the mathematics when we separated all events (equation 96) 
into two sets with the separation dependent upon the source word. In 
particular, for any source output word with composition q, we used 
a threshold, d_C(q) + Δ, just large enough so that for large n there is
almost surely a representation word in @, that is acceptable. It does 
not require, as does the fixed threshold encoder, that the set of source 
words not meeting a fixed distortion level of d* have a total probability 
that goes to zero with n. This restriction is really more severe than one 
would think we need, since some of the source words w discarded by 
the fixed threshold encoder are just outside p, having characteristics 
just less than typical, for which some of the distortions d(w, z,) might 
be only marginally greater than any fixed d*. 


16.2 The Special Case of a Double Uniform Source 


There is one situation for which both source encoders provide a
representation distortion that approaches the limit d_C as (ln n)/n.
This is when the source 𝒮 is doubly-uniform. Since μ(s, q) is independent
of q for such a source, R(d*, q) in equation 61 is also independent of q,
with the result that the set Q' in equation 66 is always empty. Therefore,
Pr(Q') = 0 in equation 69 and we have for the set of upper bounds
to representation distortion, using threshold encoders:


    d(\mathcal{S}) \le d^* + (d_{max} - d^*) \exp(-h e^{n[C - R(d^*)]}).

In this bound we have used the lower bound in equation 92 rather
than that in equation 61. It can now be shown, using precisely the
same procedure as before, that this set of bounds approaches the
limit d_C as (ln n)/n.


XVII. SUMMARY 


We have presented upper and lower bounds to the minimum attainable
transmission distortion of a source measured by a specified
distortion measure. The bounds, which were derived for both noisy
and noiseless channels, have all been shown to converge to the same
level of distortion, d_C, algebraically in the block length n. The quantity
d_C is that level of distortion shown by Shannon to be the minimum
attainable transmission distortion when the channel capacity is
C and arbitrarily complex transmission methods are allowed.

For noisy channels, the rate of approach of the lower bound to d_C
is as a/n and that of the upper bound as b[(ln n)/n]^{1/2}. The nonnegative
coefficients a and b are both functions of the statistics of the
source and channel, but have different forms. The lower bound coefficient,
a, interrelates these statistics in such a way as to suggest its
utility as a measure of “mismatch” between the source and channel:
the larger a, the slower the rate of approach of the bound to d_C, and
the larger the source-channel mismatch. This coefficient is, of course,
necessarily equal to zero whenever the source and channel are perfectly
matched, that is, whenever the minimum attainable transmission
distortion is equal to d_C for all block lengths, n.

The coefficient b in the upper bound, though, does not present an 
indicator of source-channel mismatch. It is the sum of two terms 
which separately contain the source statistics and the channel sta- 
tistics. The cause of this separation is the interface between the 
source and channel that results from the use of a transmitting signal 
set constrained to contain at most e^{nC} members, a constraint which
we found necessary to introduce in the development of the bound. 

For noiseless channels, both the upper and lower bounds to the
transmission distortion (or the source representation distortion)
have the same form. They both have been shown to approach the
asymptote d_C as a_0 (ln n)/n.

Some Considerations of Stability in Lossy
Varactor Harmonic Generators 


By C. DRAGONE and V. K. PRABHU 
(Manuscript received March 4, 1968) 


Explicit expressions are derived for the scattering parameters which
relate small-signal fluctuations in a lossy varactor harmonic generator
of order N = 2^n, n an integer. The effect of losses on the stability of the
multiplier is then studied. The very important particular case is then
examined in which all the losses occur in the series resistance of the varactor
diode, and it is shown that absolute stability is obtained provided the
efficiency η_t of the multiplier is much less than N^{-1}, because of the
particular distribution of the losses at various carrier frequencies. Therefore,
the conclusion is reached that in most cases of practical interest restrictions
have to be placed on the available circuit configurations to prevent
instability of the multiplier.


I. INTRODUCTION 


A serious limitation to efficient wideband harmonic generation with 
varactor diodes is that instability in the multiplier might cause the 
generation of spurious tones.1 It is the purpose of this paper to study 
the effect of losses on stability of abrupt-junction varactor frequency 
multipliers of order N = 2^n = 2, 4, and so on, with the minimum
number of idlers. 

The type of instability considered here is the one discussed in Refs. 
2, 3, and 4. It produces undesired low-frequency fluctuations in the 
amplitude and phase of the output harmonic and is caused by the 
time-varying elastance of the varactor, which is potentially unstable 
with respect to phase perturbations. 

The stability conditions of lossless abrupt-junction varactor mul- 
tipliers have already been extensively discussed elsewhere in Refs. 
3 and 4. More precisely, these works have shown that, in the absence 
of any losses in the varactor diode, the frequency characteristics of 
the input, output, and idler circuits must satisfy certain restrictions 
in order that the multiplier be stable. The main objective of this
paper is to determine the amount of loss that the multiplier must
have in order to be absolutely stable, that is, stable for arbitrary 
linear passive input, output, and idler circuits. 

First we show that the over-all multiplier efficiency η_t can be
expressed as the product of the efficiencies of the input, output, and
idler circuits; that is


    \eta_t = \eta_1 \times \eta_2 \times \cdots \times \eta_N,


where η_t represents the ratio of the power P_N delivered to the output
at carrier frequency Nω_0 to the power supplied by the input pump at
carrier frequency ω_0. The partial efficiency η_r is the efficiency of the
circuit at the carrier frequency rω_0; that is, 1 − η_r represents the ratio
of the power lost at rω_0 to the sum of P_N and of the total power lost at the
frequencies rω_0, 2rω_0, ···, Nω_0.

Next we show that the behavior of the multiplier with respect to
small amplitude and phase fluctuations is related in a simple way
to the efficiencies η_1, η_2, and so on. For instance, in the case of very
slow fluctuations, the PM scattering parameters of a doubler are
given by the matrix

    \begin{bmatrix} 0 & -\eta_1 \eta_2 \\ 2 & 1 - 2\eta_2 (1 - \eta_1) \end{bmatrix}.

In the last two sections we examine the conditions of absolute
stability and show that the multiplier may become unstable for some
circuit conditions if


    \eta_t > 1/N.

If, on the other hand,

    \eta_t < 1/N,

then the multiplier is absolutely stable if and only if

    \eta_r < 50\%,   for r = 1, \cdots, N/2.


Finally, the important particular case is considered in which all
the losses of the multiplier occur in the series resistance of the varactor.
It is found that in this case absolute stability is obtained if
and only if

    \omega_0 / \omega_c > 0.06,   N = 2,

    \omega_0 / \omega_c > 0.1,    N > 2,

where ω_c is the cutoff frequency of the varactor. If these conditions
are satisfied, then the efficiency of the multiplier is found to be so 
low that the conclusion is reached that in most cases of practical 
interest restrictions must be placed on the available circuit configura- 
tion in order to obtain stability. 


II. SCATTERING RELATIONS 


Nominally driven abrupt-junction varactor frequency multipliers
of order 2^n come under the general class of pumped nonlinear systems,
and the general method presented in Ref. 5 can be used for
such systems to obtain the scattering parameters which relate
small-signal fluctuations that may be present at various points in the
system.* These fluctuations are assumed to be small and
at frequencies close to the carriers.

Fig. 1 - Varactor model (elastance S(t) in series with resistance R_s).

The varactor model that we use is shown in Fig. 1. It is a variable
capacitance in series with a resistance R_s. The multiplier has the
minimum number of idlers. The linear passive circuits used in the 
multiplier as input, output, and idler terminations are assumed to 
produce no amplitude to phase or phase to amplitude conversion.* 
If input, output, and all idler circuits are tuned, it can be shown** 
that the small-signal terminal relations of a harmonic generator 
can be expressed in the form (see Fig. 2) 








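    \begin{bmatrix} (m_r)_1 \\ (m_r)_{2^n} \\ (\theta_r)_1 \\ (\theta_r)_{2^n} \end{bmatrix}
    = \begin{bmatrix} S_{mm} & 0 \\ S_{\theta m} & S_{\theta\theta} \end{bmatrix}
    \begin{bmatrix} (m_i)_1 \\ (m_i)_{2^n} \\ (\theta_i)_1 \\ (\theta_i)_{2^n} \end{bmatrix}        (1)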


* Notation in this paper is identical to that in Refs. 4 and 5. Details of these 
notations are not given in this paper for the sake of brevity. 

† This condition is satisfied by circuits usually used with multipliers.4

‡ Tuning of idlers, and input and output circuits usually gives near optimum
efficiency for the multipliers. (See Refs. 7, 8, and 9.)

Fig. 2 - Small-signal terminal behavior of a harmonic generator of order 2^n.
m is the AM index and θ is the PM index of the multiplier.


or* 


= 


= Sa (2) 


where S is the scattering matrix of the multiplier and (m_i)_j is the
incident AM index at carrier frequency jω_0, (θ_r)_k is the reflected PM
index at carrier frequency kω_0, and so on. The small-signal fluctuations
in the vicinity of carrier frequency kω_0 are assumed to be at kω_0 ± ω,
ω < ω_0/2.

It can also be shown* that the AM scattering matrix S_mm and the
PM scattering matrix S_θθ are independent of the bias source impedance
Z,, and that the stability of the multiplier is completely determined 
by S_mm and S_θθ. It can also be shown* that a multiplier of order 2^n is
stable with respect to its AM fluctuations for all input, output, and
idler terminations. In this paper we shall, therefore, obtain an expression
for S_θθ for a varactor harmonic generator of order 2^n with the minimum
number of idlers† and consider its PM stability.

An abrupt-junction varactor multiplier of order 2^n with the least
number of idlers can be shown to be completely equivalent to a
cascade of n lossless doublers‡ as shown in Fig. 3. Z_{2^k}, 0 ≤ k ≤ n, is
the termination impedance in the vicinity of carrier frequency 2^k ω_0.


* A column matrix is written in the form a, a matrix which is square is written 
as A, and a unit matrix of order n is written as 1,. 

* Methods given in Ref. 5 can, in all cases, be used to obtain S in equation (2). 

† The conditions under which a multiplier of order M_1 × M_2 is completely
equivalent to a cascade of two multipliers of order M_1 and M_2 are given in
Ref. 5.

Fig. 3 - Equivalence of an abrupt-junction varactor multiplier of order 2^n to
a chain of n doublers. R_s is the series resistance of the varactor diode.


Input and output circuits which are not shown in Fig. 4 can be any
arbitrary linear passive circuits. For ω/ω_0 ≪ 1, it can be shown that
the AM scattering matrix S_mm and PM scattering matrix S_θθ of the
kth lossless doubler are given by


a. =|} 7 (3) 

l 0 

    S_{\theta\theta} = \begin{bmatrix} 0 & -1 \\ 2 & 1 \end{bmatrix}.        (4)


Now let us consider the kth lossless doubler. The “input impedance”


(Rox)in and the “load impedance” (Rox) out of the lossless doubler are 
given by®” 


and 


Sox 
Rau = tel, isksn (5) 
and 
CR ox) out = a1 Baten, 1 < k = n. (6) 


= gti | Sox | w, ) 


Fig. 4 - An interstage network used with the multiplier.

Since all impedances are purely resistive, we can define partial
efficiencies η_{2^k} by the relations


    \eta_{2^k} = \frac{(R_{2^{k+1}})_{in}}{Z_{2^k} + R_s + (R_{2^{k+1}})_{in}},   0 \le k \le n - 1        (7)

and

    \eta_{2^n} = \frac{Z_L}{Z_{2^n} + R_s + Z_L},        (8)

where Z_L is the load resistance connected to the multiplier. We notice
that η_{2^k}, 1 ≤ k ≤ n − 1, is equal to the ratio of carrier power flowing
into the input port of the (k + 1)th doubler to that supplied by the kth
doubler, η_1 is the ratio of carrier power supplied to the first doubler to
the power supplied by the pump, and that η_{2^n} is the ratio of power
dissipated in the load resistor Z_L to that supplied by the nth doubler.
The over-all efficiency η_t of the multiplier can, therefore, be written as


    \eta_t = \prod_{k=0}^{n} \eta_{2^k}.        (9)


Consider Fig. 4. The scattering matrix of the (k + 1)th interstage
network can be shown to be


    \begin{bmatrix} 0 & \eta_{2^k} \\ 1 & 1 - \eta_{2^k} \end{bmatrix}.        (10)


If ω/ω_0 ≪ 1, we can then show from equations (4) and (10) that the
PM scattering matrix S_θθ for the multiplier shown in Fig. 3 can be
written as


    S_{\theta\theta} = \begin{bmatrix} 0 & (-1)^n \eta_t \\ 2^n & 1 + \sum_{r=0}^{n-1} (-2)^{r+1} \eta_{2^n} \eta_{2^{n-1}} \cdots \eta_{2^{n-r}} - (-2)^n \eta_t \end{bmatrix}.        (11)
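
The cascade that leads from equations (4) and (10) to equation (11) can be sketched numerically. The code below assumes that the usual two-port cascade rule applies to these modulation-index scattering matrices, and it takes the doubler matrix as [[0, -1], [2, 1]] and the interstage matrix as [[0, η], [1, 1 - η]], which are readings of equations (4) and (10) rather than confirmed values. The function names and the efficiency values are hypothetical.

```python
def cascade(a, b):
    """Scattering matrix of two-port a followed by two-port b.

    Matrices are ((S11, S12), (S21, S22)); port 2 of a faces port 1 of b.
    """
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    loop = 1.0 - a22 * b11
    return (
        (a11 + a12 * b11 * a21 / loop, a12 * b12 / loop),
        (b21 * a21 / loop, b22 + b21 * a22 * b12 / loop),
    )

DOUBLER = ((0.0, -1.0), (2.0, 1.0))            # assumed form of equation (4)

def interstage(eta):
    return ((0.0, eta), (1.0, 1.0 - eta))      # assumed form of equation (10)

def multiplier_pm_matrix(etas):
    """etas = [eta_1, eta_2, eta_4, ..., eta_{2^n}] for a chain of n doublers."""
    total = interstage(etas[0])
    for eta in etas[1:]:
        total = cascade(cascade(total, DOUBLER), interstage(eta))
    return total

eta1, eta2, eta4 = 0.4, 0.7, 0.9
s_doubler = multiplier_pm_matrix([eta1, eta2])
s_quadrupler = multiplier_pm_matrix([eta1, eta2, eta4])
```

For the doubler the result reproduces the matrix with entries 0, -η_1 η_2, 2, and 1 - 2η_2(1 - η_1); for the quadrupler the lower-right entry follows the alternating sum 1 - 2η_4 + 4η_4 η_2 - 4η_1 η_2 η_4.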


III. DERIVATION OF THE ABSOLUTE STABILITY CONDITIONS 

First, consider the case of a doubler. The scattering matrix of a
stage consisting of an ideal doubler with two resistances R_{s1} and R_{s2}
connected* in series to the input and output ports, respectively, is:


    \begin{bmatrix} 0 & -\eta_1 \eta_2 \\ 2 & 1 - 2\eta_2 (1 - \eta_1) \end{bmatrix}.        (12)


By means of standard techniques, one obtains that absolute stability
requires that‡


    2\eta_1 \eta_2 + | 1 - 2\eta_2 + 2\eta_1 \eta_2 | < 1        (13)


which is satisfied if and only if


    \eta_1 < 0.5.        (14)


* Notice that R_{s1} = Z_1 + R_s and R_{s2} = Z_2 + R_s.

† Put n = 1, N = 2, η_t = η_1 η_2 in equation (11). See also Appendix A for an
alternate derivation of absolute stability conditions.

‡ Condition (13) requires that the magnitude of the largest output reflection
that can be obtained when the termination of the input port is passive be less
than unity.


It is important to notice that (14) shows that the output circuit 
losses do not have any effect on the absolute stability conditions of 
a doubler. This property will be used in the following discussion of 
the absolute stability of a multiplier of order N > 2. 
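
The equivalence between conditions (13) and (14), and the absence of any dependence on the output efficiency, can be checked on a grid. This is a minimal sketch that assumes 0 < η_2 < 1 and reads condition (13) as 2η_1 η_2 + |1 - 2η_2 + 2η_1 η_2| < 1; the grid spacing and the function name are arbitrary choices.

```python
def doubler_is_absolutely_stable(eta1, eta2):
    """Condition (13): 2*eta1*eta2 + |1 - 2*eta2 + 2*eta1*eta2| < 1."""
    return 2 * eta1 * eta2 + abs(1 - 2 * eta2 + 2 * eta1 * eta2) < 1

eta1_grid = [k / 40 for k in range(1, 40) if k != 20]   # skip the boundary eta1 = 0.5
eta2_grid = [k / 40 for k in range(1, 40)]              # lossy output circuit, 0 < eta2 < 1
mismatches = [
    (eta1, eta2)
    for eta1 in eta1_grid
    for eta2 in eta2_grid
    if doubler_is_absolutely_stable(eta1, eta2) != (eta1 < 0.5)
]
```

An empty list of mismatches means that, on this grid, condition (13) holds exactly when η_1 < 0.5, whatever the value of η_2.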

Consider a multiplier with n > 1. It can be shown that in this case
it is necessary and sufficient that


    \eta_1 < 0.5,   \eta_2 < 0.5,   \cdots,   \eta_{N/2} < 0.5.        (15)


The fact that (15) guarantees absolute stability follows directly 
from (14) and the fact that a chain of absolutely stable stages is 
stable. 

In order to show the necessity of (15), consider the kth ideal doubler
of Fig. 5, and the impedances presented to its input and output ports 
by the remaining part of the circuit. The impedance presented to the 
input port is given by 


Lit a Lor-s + R, + Zo, (k-1) ° (16) 


Since Z,,(z-1) approaches zero as the magnitude of Z,:-. approaches 
infinity, Z,,, can have all complex values with nonnegative real part. 
Furthermore, the impedance Z,,, terminating the output port of the 
k* ideal doubler has arbitrary imaginary part, because of the presence 
of Z,.. Therefore, since (14) shows that the absolute stability of a 
doubler does not depend on the real part of the output impedance, 
one concludes that it is necessary that 


    \eta_{2^k} < 0.5,   0 \le k \le n - 1,        (17)


if the chain is to be stable for all allowable values of Zj:-:, Zoi, and 
Zonta . 











Fig. 5 - Lossless kth doubler.

IV. ABSOLUTE STABILITY CONDITIONS


Equation (15) shows that a multiplier of order N = 2^n of the type
considered in this paper will become unstable for some frequency
characteristics of the input, output, and idler circuits, if the efficiency
η_t is greater than N^{-1}, that is, if


    \eta_t > 1/N.        (18)


Therefore, if equation (18) is satisfied then the circuit must satisfy
certain conditions such as those derived in Refs. 2, 3, and 4, in order
that the multiplier be stable. If, on the other hand,


    \eta_t < 1/N,        (19)


then (15) shows that the multiplier will be stable for all circuit
conditions if and only if the efficiencies of the input and idler circuits
are all less than 50 per cent.
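
The link between condition (15) and the 1/N threshold follows from the product form of the over-all efficiency: if every input and idler efficiency is below 0.5 and the output efficiency is at most one, the product cannot reach 2^{-n}. The sketch below illustrates this for a hypothetical multiplier of order 8; the listed efficiency levels are invented for the example.

```python
import itertools
import math

def overall_efficiency(etas):
    """Product of the partial efficiencies eta_1, eta_2, ..., eta_{2^n}."""
    return math.prod(etas)

n = 3                                   # multiplier of order N = 2**n = 8
N = 2 ** n
inner_levels = [0.1, 0.3, 0.49]         # hypothetical input/idler efficiencies, all < 0.5
output_levels = [0.5, 0.9, 1.0]         # hypothetical output-circuit efficiencies
worst = max(
    overall_efficiency(list(inner) + [eta_out])
    for inner in itertools.product(inner_levels, repeat=n)
    for eta_out in output_levels
)
```

Read in the other direction, an over-all efficiency above 1/N forces at least one input or idler efficiency to be 0.5 or more, so condition (15) must fail.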

At this point the particular case 


    Z_{2^k} = 0,   0 \le k \le n        (20)


deserves special attention. This represents in fact the important case
in which all the losses occur in the series resistance R_s of the varactor.
It will be assumed that the output load has the particular value which 
gives maximum efficiency.” 

For the absolute PM stability of a doubler, equation (15) requires
that


    \eta_1 < 0.5.        (21)

We can show* that this condition can be satisfied if and only
if the overall efficiency

    \eta_t < 36\%.        (22)


In the case of a quadrupler, the condition of absolute stability
requires that

    \eta_1 < 0.5        (23)

and

    \eta_2 < 0.5.        (24)

We can show† that equations (23) and (24) can be satisfied if and
only if

    \eta_t < 0.7\%.        (25)


It therefore follows that for absolute stability of multipliers of order
2^n, it is necessary that


    \eta_t \ll 2^{-n}.        (26)


* From Ref. 7, p. 331, η_1 < 0.5 for ω_0/ω_c > 0.06. For this value of ω_0/ω_c,
η_t < 36 percent.

† See Ref. 7, pp. 364-365.


Thus, if (15) is satisfied then the multiplier is so inefficient that 
it becomes of little practical interest. Therefore one concludes that, 
if all the losses occur in the series resistance of the varactor, in most 
cases of practical interest the question of stability cannot be neglected 
and the frequency characteristics of the input, output, and idler cir- 
cuits have to satisfy certain restrictions (such as those given in Refs. 
2, 3, and 4) in order to guarantee stability of the multiplier. 


V. RESULTS AND CONCLUSIONS 


Scattering relations for lossy abrupt-junction varactor harmonic 
generators are presented in this paper. Explicit expressions have been 
given for the PM scattering parameters of the multiplier in terms 
of partial efficiencies defined for the multiplier. 

Absolute PM stability of 2^n multipliers is then considered. It is
shown that the multiplier is stable if and only if


    \eta_{2^j} < 0.5,   0 \le j \le n - 1.        (15)


We have also shown that a multiplier of order 2^n and having all
the losses occur in the series resistance R_s of the varactor diode is
absolutely stable if its efficiency is much lower than 2^{-n}, the inverse
of the order of multiplication of the multiplier.

The problem of stability is then of major importance in all high 
efficiency varactor multipliers and proper circuits should always be 
designed to assure at least the conditional stability of these multi- 
pliers.?-! 


APPENDIX 
PM Stability of 2^n Multipliers

Let us investigate by an alternate method absolute PM stability
of 2^n multipliers for n > 1. Let us consider the kth lossless doubler
(see Fig. 5) in the equivalent circuit shown in Fig. 3.

If Z;, and Z,);, are the phase terminating impedances of the kth
lossless doubler (see Fig. 5), we can derive from equations (4)
through (6) that


2 2 
ele (2*) | 
yack 1) w, = 


i 
[Z..e-»/R.] + [Zer-/R,) + 1 — ges mae “* 








a! 


1 w, , 1 Bits (&) 
~ get mao +2 a/R + Zar FU)? 
1<k<n _ (28) 
where m, is the modulation ratio of the varactor at carrier frequency 
ec Z,x’8 are all linear passive impedances, it is seen from eqs. (27) 


and (28) that the multiplier is absolutely stable with respect to PM 
fluctuations if and only if 








ee (2s) <1, l1<kn. (29) 


If any of these conditions are not satisfied, the multiplier will become 
unstable for a certain set of Z,.’s. 

Computer-Aided Analysis of


Cassegrain Antennas 


By H. ZUCKER and W. H. IERLEY
(Manuscript received December 8, 1968) 


A method of analyzing, in detail, the performance of symmetrical Cas- 
segrain antennas has been developed that uses a digital computer efficiently. 
For a specified antenna geometry and feed excitation, the program will 
compute and graphically display the amplitude and phase illumination 
of the subreflector, main reflector, and far-field pattern. These results may 
be used to optimize antenna performance by changing parameters and ob- 
serving the effect. 

Analysis of a Cassegrain antenna with a near-field conical horn feed is
discussed as an application of the method. Because the radiation
characteristics of the horn are determined by the horn flare angle rather than the
horn aperture, broadband performance is obtained. It was indeed found that
a 50 per cent bandwidth is achieved with a dual mode TE11-TM11
feed, provided the proper phase relationship between the modes can be
maintained over the band. For dual mode excitation an aperture efficiency
of 70% and a noise temperature due to the power loss at the sub and main
reflectors of less than 6.5°K was obtained. For a single mode feed (TE11),
there was a degradation in the E-plane side lobe levels and a corresponding
10°K increase in noise temperature. Excitation in the TM01 mode was also
examined for angle-error sensing purposes. Also, the antenna can be used
with reasonable efficiency well below the design frequency, in which case it
functions as a far-field fed Cassegrain antenna.


I. INTRODUCTION 


The essential radiation characteristics of multiple reflector anten- 
nas can be predicted very accurately with existing analytical and 
computational methods. Previous work on the open Cassegrain antenna 
showed that good agreement can be achieved between calculated and 
experimental results.t Deviations occurred mainly in the sidelobe 
regions of the radiation patterns, and these had only a small effect on 
overall antenna performance. 

We are concerned here with a simpler problem, but one of perhaps
more general interest: the analysis of symmetrical Cassegrain anten- 
nas. We present a computational method in which the amplitude and 
phase illuminations of the subreflector and main-reflector, as well as 
the far-field radiation pattern, are determined in detail, given the 
geometry of the antenna, the dimensions of the feed horn, and the 
excitation modes of the feed. The analysis includes near-field excita- 
tion—an important configuration for broadband operation. Included 
in the program is a graphic routine which plots all radiation patterns, 
the intermediate illuminations and the final far-field results. Because 
of this feature, and the fact that only seven parameters are
required to define the geometry of the antenna, the program is par- 
ticularly useful for optimizing antenna performance. 

The symmetry of the antenna results in improved computational 
efficiency. For the open Cassegrain antenna, double-integration was 
required to compute radiation patterns. An approximation recently 
was obtained? which, when applied to symmetrical Cassegrain anten- 
nas, eliminates one integration with only a small reduction in ac- 
curacy.? This makes it possible to compute the radiation char- 
acteristics of large Cassegrain antennas a few hundred wavelengths 
diameter in minutes. 


II. NEAR-FIELD SYMMETRIC CASSEGRAIN ANTENNA 


The antenna under consideration was intended to be used as the 
ground station of a satellite communications system. Wide-bandwidth 
(25 per cent) and low-noise requirements motivated the choice of a 
near-field conical-horn symmetric Cassegrain configuration. The 
near-field feed produces relatively low spillover at the subreflector, 
resulting in a lower noise temperature.* Also, because radiation of the 
feed is virtually confined to the geometrical illumination region of the 
horn,* there is a larger potential bandwidth available. 

An additional requirement had to be explored: operation at about
Y% nominal frequency, for target acquisition, by using both TE11 and
TM01 mode excitation. This leads to the choice of a near-field design
at the higher frequency because it would tend to function as a
conventional far-field design at the lower frequency. At the lower
frequency there would, however, be a shift of the phase center towards
the horn aperture, resulting in a phase error in the subreflector
illumination and a consequent reduction in efficiency, but perhaps it
would be adequate for the intended function. Finally, dual-mode
illumination using the TE11 and TM11 modes was of interest because
of the nearly circular symmetric radiation patterns that can be
obtained.® 


III. CASSEGRAIN ANTENNA GEOMETRY 


Figure 1 shows the geometry of a Cassegrain antenna. It consists 
of a conical feed horn, a hyperboloid subreflector and a paraboloid 
main reflector. One focal point of the hyperboloid coincides with 
the focal point of the paraboloid and the other focal point with the 
phase center of the horn. The main reflector illumination angle is 
equal to the geometrical subreflector illumination angle, θ_m. The feed
is located in the geometrical shadow region of the subreflector. 

The initial design of a Cassegrain antenna is usually based on 
geometrical optics, which imposes certain restrictions on the antenna 
geometry. The constraints are that the feed horn be located in the 
shadow region of the subreflector and that the subreflector intercept 
most of the power radiated by the horn. 

To relate the radiation properties of the horn to the antenna
geometry it is convenient to define a parameter K by:


    K = \frac{d}{\lambda} \sin\theta        (1)

where

    d = horn aperture diameter

    λ = wavelength

    θ = the angle subtended by the subreflector with respect to the
        center of the horn aperture (Fig. 1).

For conventional Cassegrain antennas representative values of K
are from 1.2 to 1.6. For these values of K the major portion of the
main lobe of a narrow angle horn excited by TE11 and TM11 modes
is intercepted by the subreflector. The lower value of K is preferable
for TE11 mode excitation because the major lobe of the horn radiation
pattern is narrower in the E plane than in the H plane. Beyond
the major lobe region the phase variations are too large for efficient
subreflector illumination.

For near-field Cassegrains the values for K are much larger, such
that the radiation characteristics of the horn are primarily determined
by the horn flare angle.

Fig. 1 - Geometry of Cassegrain antenna.

For the feed horn to be located in the geometrical shadow region of
the subreflector it is necessary that angle θ_s be not less than angle θ_b,
that is,

    \tan\theta_s = B \tan\theta_b        (2)


where B is a constant with B ≥ 1. Figure 1 shows the angles θ_s and
θ_b. The horn blocking angle θ_b can be expressed in terms of the
geometrical parameters and (1) by


    \tan\theta_b = \frac{K\lambda \sin\theta_m}{D \sin(\theta + \theta_m)}.        (3)


With all other parameters specified, θ_b is minimum for


    \theta + \theta_m = \frac{\pi}{2}.        (4)


Similarly, with θ_b also specified, K has a maximum value when (4) holds.

Using (2) and (3) and expressing θ_s in terms of the geometrical
parameters, the following equation is obtained for the subreflector
diameter, D.








2BKfv sin 6, (5) 
. BR) . 
sin (6, + 6) + or Bn 
where


f = focal length of the paraboloid main reflector. 


Equation (5) agrees with the previously given condition for no blocking,
D = (2Kfλ)^{1/2}. In practical antenna designs BKλ/16f is small
compared with sin(θ + θ_m), hence (5) may be rewritten in terms of
the main reflector diameter, D_0, as:


Bn | KX D 
Dea (2) pee at ert 6 
meuain ¢) sin (6 -+ On) ©) 


Equation (6) shows that antennas with large main reflectors also
require larger subreflectors, but that the ratio (D/D_0)^2, which is a
measure of the amount of power blocked by the subreflector, is
inversely proportional to the main aperture diameter D_0. Hence the
condition (6) is of importance primarily in the design of relatively
small Cassegrain antennas.

Equation (6) also shows, as expected, that a conventional Casse- 
grain antenna requires a smaller subreflector than a near-field Cas- 
segrain antenna, since K is smaller for the former. However, this dis- 
advantage of the near-field Cassegrain antenna is offset by other 
advantageous properties. 

Another parameter which influences the antenna design is the 
total Fresnel number of the horn at the subreflector distance, defined 


by 
d’ {1 ] 
r= £544). (7) 


For a conventional Cassegrain, $F_t$ can be selected, to a certain 
extent, independently of the antenna geometry, because the radiation 
properties of the horn are not directly related to the horn length, l. 
For such an antenna with combined TE₁₁ and TM₁₁ mode excitation, 
a total Fresnel number in the 0.5-0.65 range would provide a nearly 
uniform subreflector illumination over a wide frequency range (about 
30 per cent) with relatively small phase deviations. For TE₁₁ mode 
excitation, a lower Fresnel number is necessary, because the phase 
of the E-plane horn radiation pattern is more frequency sensitive for 
larger Fresnel numbers. 

For a near field Cassegrain antenna the total Fresnel number is 
almost directly related to the antenna geometry. Specifically, for an 
antenna with the horn located in the shadow region of the subreflector 
and with a subreflector illumination angle equal to the horn flare 
angle, $F_t$ is given by 





_ D __ tan A, 
ome: eats -) 
L, +? lL; 
with 
Z 
tan 0, = (9) 


ja), 


Since $l_1$ is much smaller than l, the total Fresnel number, $F_t$, is 
mainly determined by the subreflector diameter D. 

For a near field Cassegrain it is necessary to have both K and $F_t$ 
large. Equations (3) and (8) show that these quantities are propor- 
tional to D?. 


IV. ANTENNA DIMENSIONS 


For the antenna under consideration the main reflector dimensions 
were specified. Its diameter, $D_0$, is 224λ (λ = wavelength at the design 
frequency), its focal length, f, is 72.8λ and the corresponding 
geometrical illumination angle, $\theta_m$, is 75°. 

The initial choice of the other antenna dimensions was based on 
the following consideration. As shown above, K has a maximum for 
$\beta + \theta_m = \pi/2$. Using this condition the subreflector diameter, D, has 
been chosen such that the optimum value of K is unity at the lowest 
frequency ($\lambda_L = 4.5\lambda$). At the design frequency, K is about 3 times 
larger than is required for a conventional Cassegrain antenna feed. 
For this value of K, D is 25λ. With these parameters a horn with a 
maximum diameter of 17.6λ can be located in the shadow region of 
the subreflector. The corresponding horn length, l, is 100λ. However, 
a feed horn with these dimensions would introduce, at the lowest 
frequency, appreciable phase variations at the subreflector owing to 
the shift of the phase center of the horn radiation pattern. For this 
reason these horn dimensions were not used in the computations. 

The horn dimensions used were d = 14λ and l = 42.5λ. With these 
horn dimensions K is only slightly less than the optimum value. The 
location of the phase center, which is 5λ in front of the horn 
vertex, and the subreflector illumination angle, which is less than 
the horn flare angle, were chosen on the basis of the computed horn 
radiation patterns for combined TE₁₁ and TM₁₁ mode excitations. 
Table I summarizes the antenna dimensions. 
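
As a rough consistency check on these dimensions, the following Python sketch (an illustration added here, not part of the Fortran IV programs described in this paper) relates the quoted values to one another. It assumes the standard paraboloid relation between aperture diameter, focal length and illumination half-angle, and it assumes that the horn length l is measured along the horn wall; both function names are hypothetical.

```python
import math

# Dimensions quoted in Section IV, in units of the design wavelength.
FOCAL_LENGTH = 72.8          # f
THETA_M = math.radians(75.0) # main reflector illumination angle
HORN_LENGTH = 42.5           # l (assumed to be the slant length of the horn)
ALPHA = math.radians(9.5)    # horn flare angle


def paraboloid_diameter(focal_length, half_angle):
    """Aperture diameter of a paraboloid whose rim subtends half_angle at the focus."""
    return 4.0 * focal_length * math.tan(half_angle / 2.0)


def horn_aperture_diameter(slant_length, flare_angle):
    """Aperture diameter of a conical horn of given slant length and flare half-angle."""
    return 2.0 * slant_length * math.sin(flare_angle)


print("main reflector diameter / wavelength:", paraboloid_diameter(FOCAL_LENGTH, THETA_M))
print("horn aperture diameter / wavelength: ", horn_aperture_diameter(HORN_LENGTH, ALPHA))
```

Under these assumptions both results fall close to the quoted 224λ and 14λ, which suggests that the tabulated focal length, angles and horn length are mutually consistent.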


V. PROGRAM FOR COMPUTING ANTENNA CHARACTERISTICS 


Programs have been developed which compute the antenna radia- 
tion characteristics and plot the computed radiation patterns. The 
computational methods are similar to those used in the computation 
of characteristics of the open Cassegrain antenna. However, only 
single integrations are used; one integration was eliminated by using 
the Fresnel region approximation for wide angles and large Fresnel 
numbers.? Appendix A gives the equations used. Appendix B discusses 
operational aspects of the programs and gives flow diagrams. 

The antenna characteristics for combined TE₁₁-TM₁₁ and TM₀₁ 
mode excitations are computed in one operation. The computer 
program consists of three parts which compute (i) the horn radiation 
patterns, (ii) the subreflector radiation patterns, and (iii) the far 
field radiation patterns. 

The horn radiation patterns are computed at a constant radius, s, 
corresponding to the subreflector distance. From these computations 
the power loss at the subreflector is obtained by integration. The horn 
radiation patterns are also computed at the subreflector surface to 
obtain the subreflector illumination. 

The subreflector radiation patterns are computed at a constant 
radius, f, and at the main reflector surface. From these computations 
the power loss at the main reflector and the main reflector 
illumination are obtained. 

From the computed main reflector illumination the aperture gain, 
aperture efficiency and finally the far field radiation patterns are 
obtained. 

The antenna gain and antenna efficiency are determined from the 
computed aperture gain and efficiency respectively by including the 
loss at the subreflector and main reflector. 


TABLE I — ANTENNA DIMENSIONS 

Main reflector diameter, D₀                 224λ 
Focal length, f                             72.8λ 
Main reflector illumination angle, θ_m      75° 
Subreflector diameter, D                    25λ 
Subreflector illumination angle             9.5° 
Horn length, l                              42.5λ 
Horn flare angle, α                         9.5° 
Phase center location, p                    5.0λ 

In the computations the phase variations at the sub- and main 
reflectors are included. Also included is the effect of the main reflec- 
tor aperture blocking by the subreflector but not the effect of the 
protruding horn. 

An estimate of the antenna noise temperature is obtained by as- 
suming, somewhat arbitrarily, that near the horizon half the power 
lost at the sub- and main reflectors contributes to noise. At zenith 
it is assumed that the power lost at the main reflector contributes 
to noise. A ground temperature of 300°K is used in the computations. 
The additional noise from possible scattering of the subreflector sup- 
port and the noise from the wide angle sidelobes of the far field radia- 
tion patterns are not included in the computations. 
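
The noise temperature rule stated above can be written out as a short calculation. The following Python sketch is an illustration only (the function name and argument layout are hypothetical, not taken from the original programs); it applies the stated assumptions to power losses expressed in per cent of the power radiated by the feed horn.

```python
GROUND_TEMPERATURE_K = 300.0  # ground temperature used in the computations


def noise_temperature(loss_sub_pct, loss_main_pct, near_horizon):
    """Estimated antenna noise temperature in degrees K.

    Near the horizon, half of the power lost at the sub- and main
    reflectors is assumed to contribute to noise. At zenith, the power
    lost at the main reflector is assumed to contribute.
    """
    if near_horizon:
        fraction = 0.5 * (loss_sub_pct + loss_main_pct) / 100.0
    else:
        fraction = loss_main_pct / 100.0
    return GROUND_TEMPERATURE_K * fraction


# Example: losses of 2.7 and 0.7 per cent at the sub- and main reflector.
print(noise_temperature(2.7, 0.7, near_horizon=True))
print(noise_temperature(2.7, 0.7, near_horizon=False))
```

With losses of 2.7 and 0.7 per cent this gives about 5.1°K near the horizon and 2.1°K at zenith, which matches the corresponding entries of Table II.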


VI. COMPUTED ANTENNA CHARACTERISTICS 


The antenna characteristics have been computed with the above 
computer program for the following feed horn excitations: (i) TE₁₁ 
and TM₁₁ at the design frequency, f₀, 0.8 f₀, and 1.3 f₀, and (ii) TE₁₁ 
and TM₀₁ modes at f₀ and 0.22 f₀. The antenna characteristics for the 
different modes and frequencies are summarized in Table II. The 
tabulated power losses are normalized with respect to the total power 
radiated by the feed horn. 


6.1 TE₁₁ and TM₁₁ Mode Excitations 


The computations were performed by assuming that the two modes 
are in phase at the horn aperture. The TM₁₁ to TE₁₁ power ratio 
was assumed to be 0.17. This value was used because at the design 
frequency it minimizes the phase variations of the horn radiation 
patterns at the subreflector both in the E and H planes. 

Computations have been performed at the design frequency, f₀, 
0.8 f₀, and 1.3 f₀. This corresponds to about a 50 per cent bandwidth. 
Except for the expected change in the antenna gain, the antenna 
radiation characteristics remain virtually the same across this 
frequency range. This indicates that if a frequency-insensitive conical 
feed horn using TE₁₁ and TM₁₁ mode excitation could be developed, 
this antenna design would be capable of efficient radiation over a 
50 per cent bandwidth. 

Figures 2 through 6 show the antenna radiation patterns at the 
design frequency, f₀. Included are: the amplitude and phase of the 
subreflector illumination, the amplitude and phase of the horn 
radiation pattern at the subreflector distance, the amplitude and phase 
of the main reflector illumination, the amplitude and phase of the 
subreflector radiation pattern at the focal distance, and the far field 
pattern. 


TABLE II — CALCULATED ANTENNA CHARACTERISTICS 

Frequency                           f₀        f₀      0.8 f₀     1.3 f₀      f₀     0.222 f₀  0.222 f₀ 
Mode                             TE₁₁+TM₁₁   TM₀₁   TE₁₁+TM₁₁  TE₁₁+TM₁₁    TE₁₁      TE₁₁      TM₀₁ 

Per cent of: 
  Power loss at subreflector        2.7      20.0      3.4        2.1        9.3      26.0      50.8 
  Power loss at main reflector      0.7       2.9      0.9        0.5        1.0       4.4       5.6 
  Power blocked by subreflector     5.0       —        4.8        5.5        3.3       2.9       — 
  Aperture efficiency              72.7       —       73.4       71.5       75.4      82.2       — 
  Antenna efficiency               70.4       —       70.3       69.8       67.7      57.2       — 
Antenna gain, dB                   55.4      49.4*    53.4       57.6       55.2      41.4      33.3* 
Antenna noise temperature, °K: 
  Near horizon                      5.1      34.35     6.45       3.9       15.45     45.6      81.6 
  At zenith                         2.1       8.7      2.7        1.5        3.0      13.2      16.8 
First sidelobe, dB: 
  H plane                         −22.7     −12.9†   −23.9      −21.4      −24.1     −17.2       — 
  E plane                         −24.5     −12.9†   −23.8      −24.2      −15.1     −21.0       — 

* At maximum of the pattern. 
† Single value listed for the TM₀₁ pattern. 


Fig. 2 — Subreflector illumination (TE₁₁ and TM₁₁ modes, freq. = f₀). 

Fig. 3 — Horn radiation pattern at distance S (TE₁₁ and TM₁₁ modes, freq. = f₀). 

These figures show that the phase variations of the sub- and main 
reflector illuminations are relatively small at this frequency. This is 
because the location of the phase center and the ratio of the TM₁₁ 
to TE₁₁ modes have been chosen to minimize the phase variations at 
the subreflector. These figures also show that the far field radiation 
pattern is virtually the same in the E and H planes. This should 
result in a nearly circularly symmetric far field radiation pattern. 

Figures 7 and 8, and Figures 9 and 10 show some of the antenna 
radiation characteristics at 0.8 f₀ and at 1.3 f₀, respectively. The phase 
variations of the sub- and main reflector illuminations, though small, 
are larger than at the design frequency. This primarily results from 
the shift in the phase center of the horn radiation pattern. It is the 
shift in the phase center which ultimately limits the upper frequency 
of operation for this type of antenna. 

Fig. 4 — Main reflector illumination (TE₁₁ and TM₁₁ modes, freq. = f₀). 

Fig. 5 — Subreflector radiation pattern at focal distance (TE₁₁ and TM₁₁ 
modes, freq. = f₀). 


6.2 TM₀₁ Mode Excitation 


The radiation characteristics for the TM₀₁ mode excitation were 
computed at the design frequency, f₀. Figures 11 through 13 show 
representative radiation patterns for this mode. Particularly 
pronounced are the amplitude oscillations of the main and subreflector 
illuminations. This seems to be characteristic of the TM₀₁ radiation 
patterns from apertures and reflectors which are large compared with 
the wavelength and which are illuminated with nearly spherical wave 
fronts. However, no experimental evidence has been found to confirm 
these characteristics. 

Fig. 6 — Far field radiation pattern (TE₁₁ and TM₁₁ modes, freq. = f₀). 

Fig. 7 — Subreflector illumination (TE₁₁ and TM₁₁ modes, freq. = 0.80 × f₀). 

The advantage of this antenna for TM₀₁ mode excitation compared 
with a conventional Cassegrain is less spillover at the subreflector, 
hence, less antenna noise. However, the sidelobe levels of the 
far field radiation pattern are perhaps a few dB higher than could 
be obtained with a conventional Cassegrain. 


6.3 TE₁₁ Mode Excitation 


In view of the difficulties in realizing a conical feed horn with TE₁₁ 
and TM₁₁ mode excitation which would maintain the proper phase 
relationship at the horn aperture over a wide frequency range, TE₁₁ 
mode excitation only has been investigated for the same antenna 
geometry. 

Fig. 8 — Far field radiation pattern (TE₁₁ and TM₁₁ modes, freq. = 0.80 × f₀). 

Fig. 9 — Subreflector illumination (TE₁₁ and TM₁₁ modes, freq. = 1.30 × f₀). 

Figures 14 and 15 show some of the radiation patterns for this 
mode at the design frequency, f₀. The computations show that the 
phase variations of the sub- and main reflector illuminations are 
considerably larger, particularly in the E plane, compared with those 
obtained by using combined TE₁₁ and TM₁₁ mode excitations. Also, 
the sidelobe levels of the far field radiation pattern in the E plane are 
considerably higher than in the H plane. 

Table II shows that the computed antenna gain is 0.2 dB lower 
than the computed gain for TE₁₁ and TM₁₁ modes. However, the 
most significant difference is the increase in the antenna noise 
temperature by 10°K near the horizon. This increase is primarily caused 
by the larger power loss at the subreflector because of the E plane 
horn radiation pattern characteristics. 

Fig. 10 — Far field radiation pattern (TE₁₁ and TM₁₁ modes, freq. = 1.30 × f₀). 

Fig. 11 — Horn radiation pattern at distance S (TM₀₁ mode, freq. = f₀). 

A reduction of the antenna noise temperature by a few degrees 
might be possible by increasing the subreflector illumination angle 
perhaps even beyond the geometrical illumination angle of the horn. 
Figure 14 shows that the phase variations in the E-plane radiation 
pattern are not very large in the vicinity of the presently-used sub- 
reflector illumination angle of 9.5 degrees. The computed horn power 
radiation patterns show that if in the present design the illumination 
angle were 10.5 degrees the antenna noise temperature would be re- 
duced by 4.8°K. The antenna gain for such a design would be reduced 
by only a small amount. 

The bandwidth characteristics for this mode in the vicinity of the 
design frequency should be similar to those of the combined TE₁₁ 
and TM₁₁ modes. 

Fig. 12 — Main reflector illumination (TM₀₁ mode, freq. = f₀). 

Fig. 13 — Far field radiation pattern (TM₀₁ mode, freq. = f₀). 


6.4 TE₁₁ and TM₀₁ Mode Excitation at 0.22 f₀ 


The horn and far field radiation patterns for these modes are shown 
in Figs. 16 through 19. The right side of Table II summarizes the 
computed antenna performance at 0.22 f₀. The antenna efficiency for 
the TE₁₁ mode is relatively high, particularly in view of the large 
phase variations of the subreflector illumination. The far field 
radiation patterns for both the TE₁₁ and TM₀₁ modes show good 
characteristics. The primary disadvantages, however, are the high noise 
temperatures for both modes, owing to the power loss at the 
subreflector. 

Fig. 14 — Horn radiation pattern at distance S (TE₁₁ mode, freq. = f₀). 

Fig. 15 — Far field radiation pattern (TE₁₁ mode, freq. = f₀). 


VII. SUMMARY AND CONCLUSIONS 


Computer programs have been developed for computing the radia- 
tion characteristics of Cassegrain antennas and for plotting of the 
computed radiation patterns. The method is applicable to symmetrical 
Cassegrain antennas and provides the means of their design for nearly 
optimum performance. 

A Cassegrain antenna with a near field conical feed horn has been 
investigated for different mode excitations and over a wide frequency 
range. For large antennas (over 200 wavelength main reflector 
diameter) this type of feed can be used over a 50 per cent bandwidth 
with only small variations in the over-all antenna characteristics, 
except for the predictable increase in the antenna gain with frequency. 

Fig. 16 — Horn radiation pattern at distance S (TE₁₁ mode, freq. = 0.22 × f₀). 

Fig. 17 — Far field radiation pattern (TE₁₁ mode, freq. = 0.22 × f₀). 

The computed antenna characteristics for the combined TE₁₁ and 
TM₁₁ mode excitations show that the advantages of the combined 
excitation over TE₁₁ mode excitation alone are: (i) lower far field 
E-plane sidelobes, (ii) 0.2 dB higher antenna gain, and (iii) 10°K lower 
antenna noise temperature near the horizon. 

At the design frequency for TE₁₁ mode excitation, the computed 
antenna efficiency is 70 per cent and the noise temperature near the 
horizon is 15.5°K. With a design modification it should be possible to 
reduce the noise temperature by a few degrees without affecting the 
antenna gain. 

For the TM₀₁ mode, the calculated antenna gain and noise 
temperature at the design frequency are superior to those obtainable by 
using a conventional feed. The sidelobe levels of the far field 
radiation pattern are perhaps a few dB higher. 

At a frequency 4.5 times below the design frequency the calculated 
antenna efficiencies for the TE₁₁ and TM₀₁ modes are relatively high. 
However, the power loss at the subreflector gives rise to appreciable 
noise temperatures near the horizon. 

Fig. 18 — Horn radiation pattern at distance S (TM₀₁ mode, freq. = 0.22 × f₀). 

Fig. 19 — Far field radiation pattern (TE₁₁ mode, freq. = 0.22 × f₀). 


APPENDIX A 


Formulations Used in Computing the Characteristics of Cassegrain 
Antennas 


A.1 Horn Radiation Patterns 


The horn radiation patterns have been computed by using the 
Kirchhoff approximation to the aperture radiation field. With this 
approximation the electric field $\mathbf{E}_p$ at distances of at least a few 
wavelengths from the aperture is:’ 

\[ \mathbf{E}_p = \frac{jk}{4\pi}\int_S \left[\mathbf{E}_a(1 + \mathbf{1}_n\cdot\mathbf{1}_R) - (\mathbf{E}_a\cdot\mathbf{1}_R)(\mathbf{1}_n + \mathbf{1}_R)\right]\frac{e^{-jkR}}{R}\,dS \tag{10} \]

where 

S = horn aperture area 

k = 2π/λ 

λ = wavelength 

and $\mathbf{1}_n$ and $\mathbf{1}_R$ are unit vectors in the normal and in the R directions, 
as shown in Fig. 20. $\mathbf{E}_a$ is the electric field in the horn aperture, 
assumed to be the same as for circular waveguide modes but with 
spherical wave fronts. 

Since it has been shown by actual computations’ that the primary 
contributions to the horn radiation patterns are due to the first term 
of (10), in the computations the term $(\mathbf{E}_a\cdot\mathbf{1}_R)(\mathbf{1}_n + \mathbf{1}_R)$ has been 
neglected. 

Because of the periodicity of $e^{-jkR}/R$ with respect to the azimuth 
coordinates $\varphi'$ and $\varphi_1$, it is sufficient to evaluate (10) in a discrete 
number of $\varphi_1$ planes, the number being equal to the number of Fourier 
components of the aperture field in $\varphi'$. In particular it has been shown’ 
that for TE₁₁ and TM₁₁ mode excitations it is sufficient to evaluate one 
rectilinear x or y component of (10) in the two principal planes $\varphi_1 = 0$ 
and $\varphi_1 = \pi/2$. Similarly for TM₀₁ mode excitation only one rectilinear 
component of $\mathbf{E}_p$ in one plane needs to be evaluated. 

The integrals which are evaluated for the TE₁₁, TM₁₁, and TM₀₁ 
modes are: 

\[ E_{py} = \frac{jkl^2}{4\pi}\int_0^{\alpha}\int_0^{2\pi} E_{ay}\,\frac{e^{-jkR}}{R}\,(1 + \mathbf{1}_n\cdot\mathbf{1}_R)\,\sin\theta'\,d\varphi'\,d\theta' \tag{11} \]

with the aperture fields for the different modes given by: 


TH ,, mode 


, / 
(Lay) te: io ae £) es Jeers ) cos 2" (12) 






Fig. 20 — Coordinates for horn radiation pattern computation. 

with 
Ji(ker,,) = 0. (13) 
TM 1 mode 
al 6” 0’ / 
(Boy TMi. — Cae ) “Re Topic “| cos 2¢ (14) 
with 
Ji(kouii) = 0. (15) 
TMo5, mode 
a OV in 
(Hay)rme. = Ace 0) sin ¢ (16) 
with 
de Reinec) os 0. (17) 


$J_n$ = Bessel functions of order n. 


α = horn flare angle. 


The integration with respect to $\varphi'$ has been eliminated by 
approximating the integrals, $I_n$, given by: 

\[ I_n = \int_0^{2\pi}\frac{e^{-jkR}}{R}\,(1 + \mathbf{1}_n\cdot\mathbf{1}_R)\,\cos n(\varphi_1 - \varphi')\,d(\varphi_1 - \varphi'). \tag{18} \]

The approximations used are modifications of the previously derived* 
approximations, $I_n'$, to the integrals (18) with $\mathbf{1}_n\cdot\mathbf{1}_R = 0$. The 
modifications consist of including the values of $\mathbf{1}_n\cdot\mathbf{1}_R$ at the stationary 
phase points, since it has been shown that the previously derived 
approximations, $I_n'$, reduce to those obtained by the method of stationary 
phase, and that $I_n'$ can be separated into terms which correspond to the 
stationary phase terms. It is subsequently shown that R can be expressed 
as: 

\[ R = r\left[1 - 2u\cos(\varphi_1 - \varphi')\right]^{1/2} \tag{19} \]

where r and u are functions which are independent of $\varphi_1$ and $\varphi'$. On this 
basis, the first order approximations to (18), $I_{n1}$, are: 




—ik(Ri-u) 


-Gk(Rotu) 
| ee 7 (1 + 1,1) + ae d+ Leeda) lb) 


ge here P grease 
= —_— (l + Lala.) — jp 0 + Ltn) |secens (20) 


with 

\[ R_1 = r(1 - 2u)^{1/2} \tag{21} \]

\[ R_2 = r(1 + 2u)^{1/2} \tag{22} \]

and $\mathbf{1}_{R_1}$ and $\mathbf{1}_{R_2}$ are the unit vectors at $(\varphi_1 - \varphi')$ equal to zero and π, 
respectively. 

By using the approximation (20), one integration is eliminated in (11) 
and the radiation patterns for the different modes are computed from 
the following integrals: 

TE,, Mode 


“1,72 a , , 
(Loy) 1811 = a / | V(t ) i ct Hil iteas a sin 6’ dé’ 
0 
(23) 


where the minus sign gives the radiation pattern in the plane $\varphi_1 = 0$ 
and the plus sign the radiation pattern in the plane $\varphi_1 = \pi/2$. 

TM₁₁ mode 

The integral is analogous to (23). 

TM₀₁ mode 


kl’ f* 6’ : 
(Eyy)rMe = re i. Files on, sin 6’ dé’. (24) 


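Referring to Fig. 20, 

\[ R = (l^2 + r_1^2 + p^2 + 2r_1 p\cos\theta_1 - 2lp\cos\theta' - 2r_1 l\cos\gamma_1)^{1/2} \tag{25} \]

with 

\[ \cos\gamma_1 = \sin\theta_1\sin\theta'\cos(\varphi' - \varphi_1) + \cos\theta_1\cos\theta' \tag{26} \]

and 

\[ \mathbf{1}_n\cdot\mathbf{1}_R = \frac{p\cos\theta' - l + r_1\cos\gamma_1}{R}. \tag{27} \]

Hence a comparison of (25) with (19) shows that 

\[ r = (l^2 + r_1^2 + p^2 + 2r_1 p\cos\theta_1 - 2lp\cos\theta' - 2r_1 l\cos\theta_1\cos\theta')^{1/2} \tag{28} \]

and 

\[ u = \frac{r_1 l\sin\theta_1\sin\theta'}{r^2}. \tag{29} \]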

The integrals for the different modes have been computed at two 
values of $r_1$: (1) at the subreflector surface to obtain the subreflector 
illumination, and (2) at a constant distance corresponding to the 
shortest distance, s, from the subreflector to the horn aperture. The 
latter was performed to obtain the horn radiation pattern in a form which 
is readily measurable and convenient for subsequent computation of 
the horn power radiation pattern used for determining the power loss 
at the subreflector. 

For the first computation, referring to Fig. 21, 








2 S38) 
"2 cos 6 — B (30) 
with 
b 
B= = 
Fig. 21 — Subreflector coordinates. 

For determining the subreflector illumination it is preferable to 
obtain the illumination in terms of subreflector coordinate $\theta_2$. The 
relationship between the coordinates $\theta_1$ and $\theta_2$ is: 


1+ p? + 26 cos 6. 


cos 6, = (31) 


The second computation 


nS 5 + 8) (32) 


and the integration is performed as a function of $\theta_1$. 

A comparison has been made between some radiation patterns 
computed by single and double integration. Good agreement was 
obtained. 

The electric field in the spherical $\theta_1$ and $\varphi_1$ coordinates can be 
expressed in terms of the radiation patterns in the principal planes 
$\varphi_1 = 0, \pi/2$. 

For TE₁₁ and TM₁₁ modes 

\[ \mathbf{E}_p = \mathbf{1}_{\theta_1}E_{py}(\pi/2)\sin\varphi_1 + \mathbf{1}_{\varphi_1}E_{py}(0)\cos\varphi_1 \tag{33} \]

The TM₀₁ mode has only a $\theta_1$ component given by (24). 


A.2 Horn Power Radiation Patterns 


The horn power radiation pattern, $P_h$, is computed from the 
following integral: 

\[ P_h = \frac{r_1^2}{2\eta}\int_0^{2\pi}\int_0^{\theta_1}\mathbf{E}_p\cdot\mathbf{E}_p^{*}\,\sin\theta_1\,d\theta_1\,d\varphi_1 \tag{34} \]

η = free space intrinsic impedance. 

The total power is obtained by extending the range of $\theta_1$ to the region 
where $\mathbf{E}_p\cdot\mathbf{E}_p^{*}$ has a negligible value. 
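
The power integral can be illustrated numerically. The sketch below is not the original Fortran IV code; it assumes a field of the form (33), for which the azimuth integration of sin² and cos² contributes a factor π to each principal-plane term, and it uses a plain composite Simpson's rule as a stand-in for the program's integration routine. The Gaussian-shaped test pattern and all function names are hypothetical.

```python
import math

ETA_0 = 376.73  # free-space intrinsic impedance in ohms


def simpson(func, a, b, n=200):
    """Composite Simpson's rule on [a, b]; n is rounded up to an even number."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def radiated_power(e_plane, h_plane, theta_max, r=1.0, n=200):
    """Power through the cap 0 <= theta <= theta_max at radius r.

    e_plane(theta) and h_plane(theta) return the principal-plane field
    amplitudes; the azimuth integration has already been carried out.
    """
    def integrand(theta):
        return (abs(e_plane(theta)) ** 2 + abs(h_plane(theta)) ** 2) * math.sin(theta)
    return math.pi * r ** 2 / (2.0 * ETA_0) * simpson(integrand, 0.0, theta_max, n)


def fraction_outside(e_plane, h_plane, theta_edge, theta_total):
    """Fraction of the power radiated beyond theta_edge (a spillover-type loss)."""
    inside = radiated_power(e_plane, h_plane, theta_edge)
    total = radiated_power(e_plane, h_plane, theta_total)
    return 1.0 - inside / total


def pattern(theta):
    """Hypothetical Gaussian-shaped pattern, used only to exercise the functions."""
    return math.exp(-(theta / math.radians(8.0)) ** 2)


print(fraction_outside(pattern, pattern, math.radians(9.5), math.radians(40.0)))
```

Here theta_total plays the role of the angle beyond which the integrand is negligible; increasing it further should leave the computed fraction essentially unchanged.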

The total radiated power can also be obtained from the assumed 
fields at the aperture (12) through (17). On this basis the total 
power for the different modes is approximately 


TH, mode 
_ Qr(la)? _ » | Zebra , 
Pi, 7 Qn [ TEia1 1] aces (35) 
TM 1 mode 
Qr(la)* 
Pi ig 5: (36) 


2n 
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TM 51 mode 


T 





2 

Pu, — 5S Ji(Ker ates) (37) 

The computations performed for the different modes and 
frequencies by using (34) are in agreement with (35) through (37) 
within 1.8 per cent, with the power computed by using (34) giving 
larger values for all modes. This is at least partially caused by the 
higher values that the approximations $I_{n1}$ give compared with those 
obtained by precise numerical integration.? 


A.3 Subreflector Radiation Patterns 


The subreflector radiation patterns have been computed by using 
the surface integral relating the radiated fields and the current dis- 
tribution over a surface.” For the distance of at least a few wave- 
lengths from the reflector the radiated electric field with reference 
to Fig. 21 is: 


—ik(Retrea) 


i, = i Lr, x (J x io 
Ss 


R. ds. (38) 


where 


J = surface current density 
S, = subreflector area. 


To evaluate (38) it has been assumed that the reflector is locally 
plane. With this assumption the current density is directly related to 
the incident electric field. To simplify the computations 1p, was re- 
placed by 1,, . (A test computation of the subreflector radiation pattern 
of the open Cassegrain antenna showed that using 1,, instead of lz, 
results in a negligible difference.) 

For a hyperboloid reflector, the relations between the incident 
electric field and the current density have been derived.t By using 
the approximation (20) the integration with respect to yg. can be 
eliminated. On this basis the radiated fields of the subreflector result- 
ing from the incident fields of a TE, mode (23), are given by: 

In the plane vz = 0, 


. 0 ( 
y a J -jkre nl i T (1 + B COs 02) 1 
[',(0) lors, ms Dr i € {| 22.0) a 2,,() (8 ale cos 6.) |r 


. 7 (w\ +8 cos 2) file. 
i | 2.0 = (7) BES euece: [ejeisin 6246. (39) - 
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In the plane gs = 7/2, 


[el3) |... 


ae we pvikrs (1 + 8 cos a2 | ; 
— DN P {[ 2.00 ae 1,() I, cos 6 





8 + cos @ 
(1 + 8 cos 02 | ; 
~ | 8.0 ~ B®) B + cos 62 fz cos 6 
+ 21,2 ) rare I, sin ae sin 0, d@, . (40) 
2 


For the TM,, mode, the fields are analogous to (39) and (40) 
with the corresponding TM,, illuminated functions H,(0) and FE, (2/2). 
For the TM; mode, 


a Om 
[Z.(O)lem.. = -x | EOE 


1 + B cos 62) 71 
| cos g LL + 8 cos @) Bi -e08 i I, + 
where H,(0) is given by (24). 
In the above rz is the equation of the subreflector in the coordinates 
shown in Fig. 21, and is given by 
e (l= 
aS - = x. (42) 
I, are the first order approximations to the integration with respect 
to yg, . They are given by (20) with 1,-1z, and 1,-1z, set equal to zero. 
The other parameters are: 
= (rf +72 — 2rrz cos y2)'. (43) 


COS Y2 = sin @ sin 6 cos (gy — g2) + cos 6 cos 6. (44) 


Bsin Osin 6, r 


Parag ay ae sin 6, d4s (41) 
2 


Tr. Sin 6 sin Oo 
(? + 73 — 2rrz cos 6 cos 62)! 
The subreflector radiation patterns have been computed at two 
distances: at the paraboloid surface 


(45) 


U2. = 


2f 


"= TF cod bi 
where f = focal length of the main reflector, to obtain the main 
reflector illumination, and at 

od (47) 


to determine the power loss at the main reflector. 

The φ dependence of the subreflector radiation patterns is obtained 
in terms of the radiation patterns in the principal planes and is: 


For TE,; or TM, modes 


[E.]rr...7 Mas == 1,£,(0) COS + z.(£) sin Y. (48) 


For the TM,>, mode 
[E.}ru., = 1,/,(0). (49) 
The power radiation pattern of the subreflector is computed by using 
the integral (34). 

A.4 Aperture Gain and Efficiency 


The aperture gain and efficiency are computed by projecting the 
incident field on the main reflector aperture. The fields in the 
rectilinear x, y components are related to θ, φ components by the 
expressions 
Ey = —1,(E,, cos g — E,, cos ¢) 
— 1,(Z,, sin g, + E,, Cos ¢2). (50) 


For TE;,; and TM,; mode excitations polarized in the y direction, 


the gain on axis, Gy,, is 
Om 
i |z.(2) As 00) |? sin 6 d0 
Oe 


2 











Ag” 
Gy = Tr yo ee 2 es (51) 
i | x) + | £,(0) Phe sin 6 dé 
0 
where r is the equation for the paraboloid (46). 
The aperture efficiency, g, is obtained from the relation 

\[ g = \frac{G_y}{G_0} \tag{52} \]

with 

\[ G_0 = \left[\frac{4\pi f\sin\theta_m}{\lambda(1 + \cos\theta_m)}\right]^2. \tag{53} \]
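
The reference gain can be checked against the tabulated results with a few lines of Python. This is an illustrative sketch rather than the original program: it assumes that $G_0$ is the squared quantity $[4\pi f\sin\theta_m/\lambda(1+\cos\theta_m)]^2$, that the antenna gain equals the antenna efficiency times $G_0$, and that the focal length in wavelengths scales in proportion to frequency. The function names are hypothetical.

```python
import math


def reference_gain(focal_length_wl, theta_m_deg):
    """Gain of a uniformly illuminated aperture; focal length in wavelengths."""
    theta_m = math.radians(theta_m_deg)
    return (4.0 * math.pi * focal_length_wl * math.sin(theta_m)
            / (1.0 + math.cos(theta_m))) ** 2


def gain_db(efficiency, focal_length_wl, theta_m_deg):
    """Antenna gain in dB for a given efficiency (as a fraction, not per cent)."""
    return 10.0 * math.log10(efficiency * reference_gain(focal_length_wl, theta_m_deg))


# Focal length 72.8 wavelengths at the design frequency, 75 degree illumination angle.
for label, scale, efficiency in [("f0", 1.0, 0.704), ("0.8 f0", 0.8, 0.703), ("1.3 f0", 1.3, 0.698)]:
    print(label, round(gain_db(efficiency, 72.8 * scale, 75.0), 2))
```

With the antenna efficiencies listed in Table II, the results agree with the tabulated gains to within about 0.1 dB.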


The maximum antenna gain for the TM₀₁ mode is determined by 
normalizing the amplitude of the TM₀₁ mode electric field at the 
horn aperture with respect to the amplitudes of the TE₁₁ or TE₁₁ 
and TM₁₁ modes for the same power input, by using (35) through (37). 
The gain for the TM₀₁ mode is then related to the gain for the TE₁₁ 
mode on axis, referred to the maximum of its pattern. 

A.5 Far Field Radiation Patterns 


The far field patterns are computed from the projected field on 
the aperture, using the relation 


. -ikre 


je 
he. 
7] 





E, = 


On 


‘a E,, exp [jkr sin @sin 0, cos(y — ¢,)|r’ sin 0d6dp — (54) 
0 


where 74, 61, and gq are the coordinates of the far field observation 
point. 

Because of the antenna symmetry the integration with respect to 
φ can be readily performed. The resulting integration with respect 
to θ is, for TE₁₁ or TM₁₁ modes, 


Fs -ikre Om 
(Epon, = = : | {| 2.0 a 2,() |y.2 sin een s,) 
a 05 


+ |z.(2) — 2,0) \s.(22 sin @ sin 0.) i sin 6d6@ (55) 


where + signs correspond to the patterns in the H(y,=0) or E(y, = 7/2) 
planes. 
Similarly for the TM,, mode 





2 e ike Om 
(E,oms = ly, } 





9 . 
E(0)J; (Fe sin @ sin a) sin 6d@ (56) 
a Ob 
where H#, is the main reflector illumination (48) and (49) for the 
different modes in the planes y = Oand yg = 7/2. 


APPENDIX B 


Program for Computing the Characteristics of Cassegrain Antennas 
and for Graphic Display of Radiation Fields 


The package consists of two programs: a program to compute the 
horn radiation patterns, the subreflector radiation patterns, the far 
field radiation patterns, and other characteristics of Cassegrain an- 
tennas described in Appendix A, and a program to scale, label, and 
plot the radiation fields. The two programs are linked by intermediate 
storage of computed results and control variables on tape, at the 
conclusion of Part 1 execution. 


B.1 Computation Program 


Figure 22 is a logic diagram of the program. The following 
convention is used throughout the logic diagram. Square-bracketed symbols 
follow the notation of Appendix A, while the preceding alphanumeric 
symbols are the Fortran IV source program names. 

Fig. 22 — Logic diagram for field calculation program. 

[Logic diagram not reproduced. The box labels that remain legible are, 
in approximate order: 
START; READ NAMELIST PDATA and HDATA; 
DO 50 N = 1, L and DO 60 J = 1, NT, with CALL JO12 to calculate the 
aperture fields and CALL IO12 to calculate the integrals I0, I1, I2; 
evaluation of the integrands S1 to S3 at the subreflector and S4 to S6 
at distance s (H plane, E plane, and TM₀₁ mode); 
CALL SIM to calculate EHS and ETMS at the subreflector and EHC and ETMC 
at distance s; CALL PRAD; 
READ NAMELIST SUBDAT; CALL QUAD to calculate EHSU, EESU, ETMSU; 
evaluation of the integrands at the paraboloid surface and at distance f; 
CALL SIM to calculate EHP, EEP, ETMP and EHPC, EEPC, ETMPC; 
READ NAMELIST FFDAT; CALL QUAD to calculate EHPF, EEPF, ETMPF; 
DO 700 M = 1, LP and DO 710 J = 1, NTP; CALL SIM to calculate EHFF, 
EEFF, ETMFF; 
CALL PRAD; CALL GAIN.] 

Four groups of input data are required, designated by the NAMELIST 
names PDATA, HDATA, SUBDAT, and FFDAT. Although some of the fol- 
lowing data is redundant, the formats are designed for convenience 
and precaution. 


PDATA

NS     — Number of sets of field patterns to be computed, that is,
           NS = 2 Horn radiation patterns only
           NS = 4 Horn and subreflector patterns
           NS = 5 The above plus far field patterns
NPS    — Number of patterns per set, that is,
           NPS = 2 E & H planes only
           NPS = 3 The above plus TM₀₁ mode
ITE    — Control bit for plotting program (See Section B.2.)

HDATA

HL     — Horn length
HLAMD  — Horn length normalized with respect to design wavelength
FQ     — Frequency normalized with respect to design frequency
C      — Distance between foci of hyperboloid
BETA   — Defined by equation (30)
PL     — Location of phase center normalized with respect to horn
         length
ALPHA  — Horn flare angle
L      — Number of points at which horn radiation pattern will be
         evaluated
NT     — Number of points at which integrand will be calculated for
         evaluation of integrals in equations (23) and (24)
DEG    — Angular increment (in degrees) for obtaining subreflector
         illumination
ANG    — Angular increment (in degrees) for obtaining horn radiation
         patterns
TM11   — Complex constant which determines TM₁₁ mode to TE₁₁
         mode ratio
I1P    — Control bit for plotting program (See Section B.2.)

SUBDAT

F      — Focal length of paraboloid
         Note: The dimensional unit for F, C, HL must be the same.
FLAMD  — Focal length normalized with respect to design wavelength
CLAMD  — C normalized with respect to design wavelength
LS     — Number of points at which subreflector radiation pattern
         will be evaluated
NTS    — Number of points at which integrand will be calculated for
         evaluation of integrals in equations (39), (40), and (41)
GAMA   — Main reflector illumination angle (degrees)
DEGP   — Angular increment (in degrees) for obtaining main reflector
         illumination
ANGP   — Angular increment (in degrees) for obtaining subreflector
         radiation patterns
INCS   — (N + 1) where N is the number of points to be interpolated
         between previously computed subreflector illumination points
LB     — Number of points which are not included in integral (51)
         because of subreflector blocking
I2P    — Control bit for plotting program (See Section B.2.)

FFDAT

NTP    — Number of points at which integrand will be calculated for
         evaluation of integrals in equations (55) and (56)
GAMAB  — Angular portion (in degrees) of main reflector blocked by
         subreflector
NTPB   — Number of points which are not included in integrals in
         equations (55) or (56) owing to subreflector blocking after
         interpolation of main reflector illumination
LP     — Number of points at which far field radiation patterns will
         be evaluated
INCP   — (N + 1) where N is the number of points to be interpolated
         between previously computed main reflector illumination
         points
DEGFF  — Angular increment (in degrees) at which far field radiation
         pattern will be evaluated
I3P    — Control bit for plotting program (See Section B.2.)


The following subprograms must be included in the deck before
execution.

PRAD   — Computes the power radiation patterns in accordance with
         equation (34)

J012   — A Bessel function subroutine developed by J. Alan Cochran
         and P. A. Alsberg. The present version includes two
         subsidiary subroutines:

           DPHASE — Uses phase-amplitude method for large values
                    of argument

           JLOW   — Uses downward recursion technique for small
                    values of argument

I012   — Calculates the first order approximations, equation (20),
         to the integrals of equation (18)

QUAD   — A quadratic interpolation scheme for complex arrays

INTERP — A subroutine called by QUAD

PROC   — A subroutine to format and print output data

SIM    — A complex Simpson's rule integration routine. Will accept
         an even or odd length array with negligible variation in
         accuracy

GAIN   — Computes the antenna gain and aperture efficiency as defined
         by equations (51), (52), and (58)

TR     — A special-purpose Simpson's rule integration to evaluate the
         integrals in equation (51). This function subprogram is
         called only by the GAIN subroutine


The program requires approximately (52,660)₈ or (22,000)₁₀ words
of storage. A representative execution time for both modes (TE₁₁
and TM₁₁ combined, and TM₀₁) at the design frequency is 14 minutes;
the same calculations at 0.22 times the design frequency, where
a smaller number of integration points is required, take
approximately 5 minutes.
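
The octal and decimal storage figures quoted above can be checked with a one-line conversion. The following is only an illustrative check in Python (not part of the original Fortran IV deck); the variable name is an assumption made for the example.

```python
# Convert the quoted octal word count to decimal.
storage_words = int("52660", 8)

# The text rounds this figure to 22,000 decimal words.
print(storage_words)
```

The result, 21,936 words, is consistent with the rounded figure of 22,000 given in the text.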


B.2 Plotting Program 


A logic diagram of the program is presented in Fig. 23. 

All input data required by the plotting program has been stored 
on tape by the previous program. 

The plotting control bits, referred to in Section B.1, have the
following meaning. If field calculations are to be made in a combined
TE₁₁ and TM₁₁ mode (that is, input data TM11 ≠ (0.0, 0.0)), the
control bit ITE in NAMELIST PDATA must be set equal to 2. If the field
calculations are to be made in the TE₁₁ mode alone (that is, TM11 =
(0.0, 0.0)), the control bit should be set equal to 1. Therefore
calculations for the three sets of radiation patterns, horn,
subreflector, and far field, will generally be made in two modes, a
combined TE₁₁ and TM₁₁ mode, and the TM₀₁ mode, where it is understood
that the combined mode may be the pure TE₁₁ mode, if TM11 = (0.0, 0.0)
and ITE = 1.

For the combined mode the radiation fields will be evaluated in
both E and H planes. E- and H-plane data are plotted together for
ease of comparison. However, in some cases (particularly for certain
far field patterns) a rapidly varying phase plot superimposed on a
rapidly varying amplitude plot may result in an unclear graph. For
this reason the control bits I1P, I2P, and I3P are introduced. I1P
controls horn radiation pattern plotting, I2P controls subreflector
plotting, and I3P, far field plotting. If the control bit for a
particular field is set equal to 0, two plots will be generated,
that is,


Combined Mode
(i) E-plane and H-plane amplitude and phase

TM₀₁ Mode
(ii) Phase and amplitude


However, if the control bit is set equal to 1, four plots will be
generated:


Combined Mode
(i) E-plane amplitude and H-plane amplitude
(ii) E-plane phase and H-plane phase

TM₀₁ Mode
(iii) Amplitude
(iv) Phase
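
The effect of a control bit on the plot list can be summarized in a few lines. The sketch below is written in Python for illustration only; the function name `plots_for` and the dictionary of bit values are assumptions and do not correspond to names in the Fortran IV plotting program.

```python
def plots_for(control_bit):
    """List the plots produced for one field (horn, subreflector, or far field)."""
    if control_bit == 0:
        # Amplitude and phase share one frame per mode: two plots.
        return [
            "combined mode: E-plane and H-plane amplitude and phase",
            "TM01 mode: phase and amplitude",
        ]
    if control_bit == 1:
        # Amplitude and phase are drawn on separate frames: four plots.
        return [
            "combined mode: E-plane amplitude and H-plane amplitude",
            "combined mode: E-plane phase and H-plane phase",
            "TM01 mode: amplitude",
            "TM01 mode: phase",
        ]
    raise ValueError("control bit must be 0 or 1")


# One control bit per field: I1P (horn), I2P (subreflector), I3P (far field).
bits = {"I1P": 1, "I2P": 1, "I3P": 1}
total_plots = sum(len(plots_for(bit)) for bit in bits.values())
print(total_plots)
```

With all three bits set to 1 the count is twelve plots, the case for which the execution time of the plotting program is quoted.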


Vertical scales are restricted to allow only 20 divisions; therefore a
preferred set of increments for the various scales has been selected.
The allowed increments in dB for the amplitude scale, stored in
array ADB(I), are:


0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0;

for the phase scale, in degrees, stored in array APH(I):


1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0, 12.0, 15.0, 18.0;

for the angle-off-axis scale, in degrees, stored in array AS(I):


6.0, 5.0, 4.0, 3.0, 2.5, 2.0, 1.5, 1.0, 0.75, 0.5, 0.4, 0.3, 0.2.

Fig. 23 — Logic diagram for field plotting program.
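
One plausible way to use such a table is to take the smallest allowed increment for which the data range fits within 20 divisions. The Python sketch below illustrates that selection rule; the rule itself and the function name `pick_increment` are assumptions, since the text does not spell out how the plotting program chooses among the stored increments.

```python
import math

ADB = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0]  # amplitude increments, dB
MAX_DIVISIONS = 20


def pick_increment(lowest, highest, allowed, max_divisions=MAX_DIVISIONS):
    """Return (increment, scale_min, scale_max), or None if nothing fits.

    The scale limits are rounded outward to multiples of the increment.
    """
    for inc in sorted(allowed):
        scale_min = math.floor(lowest / inc) * inc
        scale_max = math.ceil(highest / inc) * inc
        divisions = round((scale_max - scale_min) / inc)
        if divisions <= max_divisions:
            return inc, scale_min, scale_max
    return None


# Hypothetical pattern spanning 0 dB down to -37.2 dB.
print(pick_increment(-37.2, 0.0, ADB))
```

For this hypothetical range the 2.0 dB increment is the first that fits (19 divisions from -38 dB to 0 dB). A range that cannot be covered even by the coarsest increment returns `None`, which a caller would have to handle, for example by clipping the data.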


The following subroutines must be included in the deck before 
execution: 


PLOT 2  — A subroutine to generate a grid with two independently
          labeled ordinates sharing a common abscissa

MINMAX  — A subroutine to select the algebraically largest or smallest
          entry in an array and specify its index

INTERP  — A quadratic interpolation scheme for real arrays

FILTER  — Adjusts plotting data for phase variations in the vicinity
          of ±180°

LABEL 2 — A modified version of the microfilm subroutine LABEL.
          Called only by PLOT 2.


The program requires approximately (51,536)₈ or (22,000)₁₀ words
of storage, and about 0.5 minute execution time for all twelve plots.


Precise 50 to 60 GHz Measurements on
a Two-Mile Loop of Helix Waveguide

By D. T. YOUNG and W. D. WARTERS


Precise measurements made in the 50 GHz to 60 GHz band on a two- 
mile triangular loop of 2 inch diameter helix waveguide are presented. 
The measuring technique is discussed in some detail regarding accuracy. 
A brief comparison of the experimental results with theory is made. The
average measured attenuation of the waveguide varies smoothly from 2.62 dB 
per mile at 50 GHz to 2.32 dB per mile at 60 GHz. Fast variations versus 
frequency were within experimental error. Several short-radius bends of 
different angles were measured; losses less than 0.8 dB across the band 
were observed for a 42° bend made of mitered elbows. 


I. INTRODUCTION 


Low-loss transmission via the TE₀₁ mode in circular waveguide
has been studied for many years for use as a wideband communica- 
tion medium. Much work has been done on the design of improved 
waveguides,” * the understanding of the effects of spurious modes,* ® 
and the measurement of sample guides over wide frequency bands.* 7 

Interesting waveguide communication system layouts have pro- 
posed repeater spacings in the range of 10 to 20 miles. Reasonable 
design requires that the total loss of such a waveguide section be 
predictable to within a few dB. However, the longest guides on which 
measurements have been reported are a few hundred yards, and the 
variation to be expected between different samples of similar con- 
struction is unknown. 

This paper describes measurements made in the 50 GHz to 60 
GHz band on a two-mile triangular loop of helix waveguide. Ex- 
tremely precise observations were made on many sections of the 
loop in order to: 


(i) Test whether the loss of a long line is indeed the sum of the
losses of its component sections, as is expected if the sections act
independently.

(ii) Discover the statistical variations between sections, both in
average loss and loss fluctuations with frequency, so that confidence
limits can be found for predicting the behavior of very long lines
from measurements on shorter lines.

(iii) Allow accurate measurements of bends and other components by
taking the difference between the losses of sections with and without
the test component included.


II. THE TWO-MILE WAVEGUIDE LOOP 


The two-mile facility was constructed at Holmdel, New Jersey, 
by A. C. Beck and C. F. P. Rose. The permanent installation con- 
sists of a triangular shaped loop of two parallel 4-inch steel conduits 
buried below the frost line, with poured concrete ties every 10 feet, 
along a precisely aligned path. The layout is shown in Fig. 1. The 
loop begins and ends in a laboratory building, and large waterproof 
access manholes are provided approximately every 400 feet, as in- 
dicated by the letters in Fig. 1. The waveguide was installed in the 
conduit by adding sections in one manhole and pulling the assembled 
guide through the conduit to the next manhole with a cable and 
winch. 

The vertical profile of the path is quite smooth, with no radii of 
curvature less than 4,000 feet. The horizontal plan of the path con- 
sists of straight lines, as shown in Fig. 1, except the two sections 
between manholes U, V, and W. These sections have a constant radius 
of curvature of 708 feet. The angles at the corners of the loop are 90°, 
90°, and 42°. 

The waveguide was two-inch inside diameter steel-jacketed helix 
waveguide. It was constructed at the Holmdel Laboratory by A. C. 
Beck and C. F. P. Rose and has been described by them.’ It was 
made in 15-foot lengths which were connected with precision threaded 
couplings. The guide rests on its couplings in the steel conduit be- 
tween manholes, and is thus supported at 15-foot intervals. A short 
connecting section is provided in each manhole; it is easily removed 
to allow insertion of measuring gear. 

The total added loss at 55 GHz owing to the horizontal and verti- 
cal path bends has been calculated by A. C. Beck to be 0.045 dB 
and 0.002 dB, respectively, for the whole two miles. The former was
readily measured; the latter was beyond our measurement accuracy.

To complete the loop, various types of sharp-radius bends were 
placed in the corner manholes H, O, and U. Section IV describes these 
bends and experimental measurements on them. 


Fig. 1— Layout of experimental Holmdel waveguide.


The entire waveguide loop, including bends, was made vacuum
tight and could be evacuated to a pressure of a few microns of
mercury. All measurements were made with the guide filled to
slightly over one atmosphere with high purity dry nitrogen. Each
time the guide was opened to change experimental conditions it was
flushed with nitrogen, pumped, then refilled. These precautions are
necessary to eliminate oxygen, which has several strong absorption
lines in our band of interest.


III. MEASURING TECHNIQUES AND DATA REDUCTION 


The TE₀₁ transmission losses of the waveguide sections of interest
were measured by the shuttle-pulse method. This method allows 
highly accurate measurements on low-loss line sections, provided that 
certain precautions are observed, because it includes observations on 
many round-trip traversals of the section and because the time res- 
olution of the pulse allows spurious reflections to be avoided. 


3.1 Apparatus 


Figure 2 is a block diagram of the measuring setup. It used a 
heterodyne receiver system in which the CW beating oscillator signal 
and the transmitted test pulses are both provided from a single
backward wave oscillator by pulsing the beam voltage every 100 μs with
a 0.1 μs duration pulse which changes the oscillator frequency by 70
MHz. This scheme was suggested by D. H. Ring and has been de- 
scribed earlier.° The test pulses and beating oscillator power driving 
the converter are reflected from the coupling mesh. A portion of each 
transmitted signal pulse enters the test section and bounces back 
and forth between the mesh and piston in the test section many times, 
thus causing a train of pulses with decreasing amplitudes to be re- 
turned to the receiver. 

Since the mesh has transmission loss of approximately 17 dB, the 
level of the signal pulses which have traveled in the test section be- 
fore returning to the converter is at least 34 dB below the beating 
oscillator level and good receiver linearity is assured. The trans- 
mitter power and receiver noise figure allowed as many as 100 trips 
to be observed, depending on the length of the test section. 
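
The 34 dB figure follows from the pulse crossing the mesh twice, once entering and once leaving the test section. The short Python sketch below tabulates the returned pulse level against trip number under that reading; the round-trip loss of 0.5 dB is a hypothetical value chosen for illustration, and the level is referred to the pulse incident on the mesh, which is taken here (as an assumption) to be essentially the beating oscillator level.

```python
def pulse_level_db(trip, mesh_transmission_loss_db=17.0, loss_per_round_trip_db=0.5):
    """Level (dB, relative to the incident pulse) after `trip` round trips."""
    if trip < 1:
        raise ValueError("a returned pulse has made at least one round trip")
    # Two mesh crossings (in and out) plus the accumulated round-trip loss.
    return -(2.0 * mesh_transmission_loss_db + trip * loss_per_round_trip_db)


for trip in (1, 10, 100):
    print(trip, pulse_level_db(trip))
```

Every returned pulse is therefore more than 34 dB down, and with the assumed 0.5 dB per trip the hundredth pulse is 84 dB down, which indicates the dynamic range the receiver must cover.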

The 70 MHz IF pulse train passes through a precision attenuator
adjustable in 0.1 dB steps. The range unit opens a 0.4 μs time gate
to select a desired pulse from the train. The selected pulse is
peak-detected and read on an expanded-scale levelmeter. The attenuator
is set to center the levelmeter, so the entire IF strip operates at
constant level. Readings of relative pulse height may readily be made
to within 0.05 dB. Measurements are made, after adjusting the BWO
and converter for the desired frequency, by selecting a series of pulses
(usually 15 to 20) from the train with the range unit and recording
the trip number and relative pulse height of each. This data, together
with end corrections for mesh and piston return losses, is reduced by
a simple computer program.

Fig. 2— Block diagram of measuring setup.

The millimeter-wave circuitry is all of precision construction and, 
with the exception of the TE₁₀-TE₀₁ transducer, is extremely well
matched. The effects of the 15 to 20 dB return loss of the transducer 
will be discussed later. 

The coupling mesh is a flat transverse copper plate 1/32 inch thick 
with many small uniformly-spaced holes. Care was taken in machin- 
ing both it and its mounting fixture to insure flatness. The shorting 
piston was 1/4 inch thick solid copper, machined for flatness and
polished. The return loss of the mesh was precisely measured by sub- 
stituting it for the solid piston in a short test line (terminated beyond 
it) and comparing the relative pulse heights of the 100th trips in the 
two cases. The mesh return loss varies between 0.054 and 0.114 dB 
across the 50-60 GHz band. The return loss of the solid copper pis- 
ton is taken to be the calculated value of 0.005 dB. 
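
The substitution measurement can be read as follows: with everything else unchanged, replacing the piston by the mesh changes the loss per trip by the difference of the two return losses, so the difference in pulse height at the 100th trip is 100 times that amount. The Python sketch below expresses this reading; it is an interpretation of the procedure rather than the authors' reduction, and the 4.9 dB and 10.9 dB inputs are illustrative values back-computed from the quoted range, not measured data.

```python
def mesh_return_loss_db(height_difference_db, trips=100, piston_return_loss_db=0.005):
    """Return loss of the mesh inferred from the pulse-height difference.

    height_difference_db: how much lower the pulse is, after `trips` trips,
    with the mesh in place of the solid piston.
    """
    return piston_return_loss_db + height_difference_db / trips


print(mesh_return_loss_db(4.9))   # lower end of the quoted range
print(mesh_return_loss_db(10.9))  # upper end of the quoted range

# A 0.05 dB reading uncertainty, spread over 100 trips:
print(0.05 / 100)
```

Under these assumptions a pulse-height reading good to 0.05 dB resolves the return loss to about 0.0005 dB, which is why the comparison is made at a high trip number.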

The shuttle pulse technique provided a further important advan- 
tage for the present experiments, where many sections physically 
separated by large distances were to be measured, by allowing the 
test gear to remain in one location. The coupling mesh to the test 
section was placed in any desired manhole around the loop, and the 
shorting piston for the far end of the test section was then placed in 
the appropriate following manhole. 

The waveguide between the building where the test gear was lo- 
cated and the manhole where the coupling mesh was located served 
as a transmission line. Thus measurements could be made on the 
waveguide section between any two manholes chosen by the experi- 
menter simply by locating the mesh and piston appropriately. The 
waveguide between the test gear and the coupling mesh also served 
as a delay line to allow complete time separation between the incident 
pulse and the pulse reflected from the mismatch at the mode trans- 
ducer. 

Unless these pulses are separated the effective return loss of the 
coupling mesh will vary considerably. The measured return loss of 
the coupling mesh is then no longer correct, and this will seriously 
affect the accuracy of measurement of short waveguides. 


3.2 Data Reduction 


The basic assumption in shuttle pulse measurements is that the
loss of each successive trip through the test section is identical, thus
the total loss in decibels is a linear function of trip number. The
validity of this assumption for our experiment is discussed in
Section 3.3.

In order to obtain high precision, many sets of pulse height level
vs trip number readings $(h_i, n_i)$ were taken at each test frequency
for each test line. To weight the readings equally and to obtain a
measure of the experimental precision, a straight line was fitted by
the method of least squares. Thus $A$ and $B$ were chosen such that
$M(A, B)$ was minimized, where

$$M(A, B) = \sum_{i=1}^{N} (h_i - A + B n_i)^2 .$$

This requires

$$A = \frac{K \sum_{i=1}^{N} n_i h_i - Q \sum_{i=1}^{N} h_i}{K^2 - NQ}$$

$$B = \frac{N \sum_{i=1}^{N} n_i h_i - K \sum_{i=1}^{N} h_i}{K^2 - NQ}$$

where

$$K = \sum_{i=1}^{N} n_i , \qquad Q = \sum_{i=1}^{N} n_i^2 ,$$

and $N$ is the number of data pairs $(h_i, n_i)$. $B$ is therefore the
desired experimental loss per trip and $A$ is the intercept at zero
trips. $A$ depends upon the transmitter power level and is of interest
mainly as an internal check that the data is consistent with other
measurements. The attenuation constant $\alpha$ of the test section is
then computed from

$$\alpha(f) = \frac{1}{2L} \left[ B(f) - C(f) \right] \qquad (1)$$

where $L$ is the length of the section, $B$ is the measured round trip
loss and $C$ is the known end correction.
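
The fit and equation (1) translate directly into a few lines of code. The following Python sketch evaluates the closed-form expressions for A and B and then the attenuation constant; it is an illustration and not the authors' reduction program, and the function names, the sample readings, the 0.08 dB end correction, and the 450-foot length are all assumed values.

```python
def fit_shuttle_pulse(trips, heights_db):
    """Least-squares fit of h = A - B*n; returns (A, B, M)."""
    N = len(trips)
    K = sum(trips)
    Q = sum(n * n for n in trips)
    sum_h = sum(heights_db)
    sum_nh = sum(n * h for n, h in zip(trips, heights_db))
    denom = K * K - N * Q
    A = (K * sum_nh - Q * sum_h) / denom
    B = (N * sum_nh - K * sum_h) / denom
    # M is the minimized sum of squared residuals.
    M = sum((h - A + B * n) ** 2 for n, h in zip(trips, heights_db))
    return A, B, M


def attenuation_db_per_mile(B, end_correction_db, length_miles):
    """Equation (1): alpha = (B - C) / (2 L)."""
    return (B - end_correction_db) / (2.0 * length_miles)


# Hypothetical noise-free readings lying on h = -3.0 - 0.5 n.
trips = [5, 10, 15, 20]
heights = [-5.5, -8.0, -10.5, -13.0]
A, B, M = fit_shuttle_pulse(trips, heights)
alpha = attenuation_db_per_mile(B, end_correction_db=0.08, length_miles=450 / 5280)
print(A, B, M, alpha)
```

Because the model is written as h = A - Bn, a positive B corresponds to pulse height falling with trip number, so B can be used directly as a loss per round trip.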

If we assume that the pulse height measurements $h_i$ are distributed
normally about the true line $A_t - B_t n_i$, we can readily calculate
the standard deviation of $B$, our experimental measure of $B_t$, and
thus the accuracy of our experimental value of $\alpha$.

If we assume the variance $\Delta^2$ of the $h_i$ is known so that

$$h_i = A_t - B_t n_i + \Delta_i \qquad (2)$$

with $\langle \Delta_i \rangle = \langle \Delta_i \Delta_j \rangle = 0$,
except $\langle \Delta_i^2 \rangle = \Delta^2$, where $\langle\ \rangle$
is the statistical expected value, then we can readily calculate

$$\langle (B - B_t)^2 \rangle = \frac{N \Delta^2}{NQ - K^2} . \qquad (3)$$

Thus the accuracy of the experimentally determined loss is related to
the individual measurement variance $\Delta^2$ by equation (3).

If the variance $\Delta^2$ is unknown then one can use the variable

$$t = (B - B_t) \left[ \frac{(N - 2)(NQ - K^2)}{NM} \right]^{1/2} \qquad (4)$$

where $N$, $M$, $K$ and $Q$ are as previously defined. It can be shown®
that $t$ has Student's $t$ distribution with $N - 2$ degrees of freedom,

$$f(t) = \mathrm{Const} \left( 1 + \frac{t^2}{N - 2} \right)^{-(N-1)/2} . \qquad (5)$$

From (4) we can write

$$(B - B_t)^2 = \frac{NM}{(N - 2)(NQ - K^2)} \, t^2$$

and $\langle t^2 \rangle$ can be evaluated from (5) to give

$$\langle (B - B_t)^2 \rangle = \frac{NM}{(N - 4)(NQ - K^2)} , \qquad N > 4. \qquad (6)$$

The value of $\langle (B - B_t)^2 \rangle$ was calculated from (6) for
each measurement. Comparison with (3) over many measurements gives a
value for $\Delta$ of about 0.05 dB, which is in agreement with the
expected limit of accuracy of our pulse height measurements.

The loss in dB per mile as calculated from (1) and the standard
deviation of the measurement as calculated from (6) were plotted
versus frequency for each test section by the computer. Some of these
results are shown in Figs. 3 to 8 and are discussed in detail in Section 4.

For most test sections, measurements were made at frequencies
spaced by 100 MHz from 50 to 51 GHz and from 59 to 60 GHz, and
at frequencies spaced by 1 GHz from 51 to 59 GHz. This arrangement
allowed a check at the ends of the band on the consistency between
the calculated deviations and the actual spread of points, and gave
sufficiently fine-grained data across the band to detect any expected
variations with frequency. Helix waveguide of the type used in these
experiments is not expected to show loss variations vs frequency with
periods less than 6 GHz.°

Fig. 3— Measured attenuation of short length waveguide sections.
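
The two estimates of the slope uncertainty, one for a known reading variance and one derived from the fit residuals, can be sketched in Python as below. The function names are assumptions, and the example uses fifteen consecutive trip numbers purely for illustration; the trips actually selected in a run need not be consecutive.

```python
import math


def slope_std_known_variance(trips, reading_std_db):
    """Standard deviation of the fitted slope when the reading scatter is known."""
    N = len(trips)
    K = sum(trips)
    Q = sum(n * n for n in trips)
    return math.sqrt(N * reading_std_db ** 2 / (N * Q - K * K))


def slope_std_from_residuals(trips, M):
    """Standard deviation of the slope from the residual sum M (needs N > 4)."""
    N = len(trips)
    if N <= 4:
        raise ValueError("the residual-based estimate requires more than 4 readings")
    K = sum(trips)
    Q = sum(n * n for n in trips)
    return math.sqrt(N * M / ((N - 4) * (N * Q - K * K)))


trips = list(range(1, 16))          # hypothetical: trips 1 through 15
sigma_known = slope_std_known_variance(trips, 0.05)
# If the residuals scatter with the same 0.05 dB spread, M is about (N - 2) * 0.05**2.
sigma_resid = slope_std_from_residuals(trips, (len(trips) - 2) * 0.05 ** 2)
print(sigma_known, sigma_resid)
```

With 0.05 dB readings the slope is determined to roughly 0.003 dB per trip in this example. The residual-based value is somewhat larger because the factor (N - 2)/(N - 4) accounts for the variance having been estimated from the same data.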


3.3 Experimental Precautions and Limitations


There are a variety of precautions that must be observed in shut- 
tle pulse measurements in order to avoid anomalies and inaccuracies. 

Of prime importance for high precision is that there be no interaction
between different traversals of the signal pulse in the test section.
Otherwise the observed loss will not be a linear function of the number
of trips and the desired single trip loss will be difficult to derive.
Interactions can occur in two major ways: (i) between successive trips
when spurious mode generation is high enough or spurious mode loss is
low enough that significant spurious mode power can be built up during
one traversal and then be reconverted to the TE₀₁ mode in the next
traversal, and (ii) between nonsuccessive trips when the test section
length L and the delay line length l are related by mL ≈ nl so that
pulses bouncing in the delay line as a result of reflections or mode
conversions at the input transducer can coincide with some of the
desired signal pulses bouncing in the test section.

Fig. 4 — Measured attenuation of medium length waveguide sections.

The first type of interaction is readily observed in waveguides 
with low spurious mode loss and has been discussed? in detail. The 
cure in the low-loss case is to provide mode filters at each end of 
the test section. For helix waveguide with high spurious mode loss, 
as in our experiments, it is expected that the spurious mode level is 
never high enough to cause observable interactions for all except
TE₀ₙ modes. This expectation was tested in several guide sections at
several frequencies by using a movable shorting piston and observing
the signal pulse after many round trips. By moving the shorting
piston one changes the phases of any reflected spurious modes and
thus of the reconverted TE₀₁, causing distinctive variations in the
observed signal pulse height. No variations outside experimental
uncertainty were observed with the exception of a series of narrow loss
peaks at the TE₀₂ spacing.

Fig. 5— Measured attenuation of long length waveguide sections.


The TE₀₂ mode is coupled to TE₀₁ by imperfections possessing
circular symmetry,¹⁰ such as diameter changes or slight dishing of the
mesh or end pistons. Its loss in helix waveguide is very low, so it can
interact over several trips in short waveguide sections, causing loss
peaks when the frequency and section length L are such that 2L
contains an integral number of TE₀₁-TE₀₂ beat-wavelengths, or
nearly so.* The diameter tolerance of the helix waveguide was such
that continuous conversion to TE₀₂ was not expected to be observable,
so conversion at the coupling mesh and end piston was suspected as
the cause.

Fig. 6 — Measured attenuation of total straight waveguide.

Fig. 8— Bend losses.

This suspicion was verified by the following experiment. The piston 
and mesh were fixed, and the test frequency was varied slowly. Loss 
peaks were observed every 120 MHz although the beat-wavelength 
condition was satisfied every 60 MHz in the test line. Such an effect 
should indeed occur if both mesh and piston are converters of roughly 
similar magnitude. When the coupling mesh was turned around, the 
loss peaks still occurred every 120 MHz but were shifted 60 MHz 
to frequencies between those observed originally, thus indicating the 
expected phase reversal in the coupling at the mesh. Various meshes 
and end-pistons were tried, with similar results. 
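
The frequency interval at which the beat-wavelength condition recurs can be estimated from the phase constants of the two modes. The Python sketch below does this for an ideal, lossless circular guide of 2-inch diameter, which is an approximation to the helix waveguide; the section length of 400 feet is an assumption (the length used in the experiment is not stated), and the constants 3.8317 and 7.0156 are the first two nonzero roots of the Bessel function J₁.

```python
import math

C_LIGHT = 299_792_458.0      # speed of light, m/s
RADIUS_M = 0.0254            # 2-inch inside diameter, so 1-inch radius
ROOT_TE01 = 3.8317           # first nonzero root of J1 (TE01 cutoff)
ROOT_TE02 = 7.0156           # second nonzero root of J1 (TE02 cutoff)


def phase_constant(freq_hz, root, radius_m=RADIUS_M):
    """Phase constant (rad/m) of a TE0n mode in an ideal circular guide."""
    k = 2.0 * math.pi * freq_hz / C_LIGHT
    kc = root / radius_m
    return math.sqrt(k * k - kc * kc)


def beat_wavelength_m(freq_hz):
    """Distance over which TE01 and TE02 slip by one full cycle."""
    d_beta = phase_constant(freq_hz, ROOT_TE01) - phase_constant(freq_hz, ROOT_TE02)
    return 2.0 * math.pi / d_beta


def beat_condition_spacing_hz(freq_hz, length_m, step_hz=1.0e6):
    """Frequency change that alters the number of beats in 2L by one."""
    def beats(f):
        return 2.0 * length_m / beat_wavelength_m(f)
    slope = (beats(freq_hz + step_hz) - beats(freq_hz - step_hz)) / (2.0 * step_hz)
    return 1.0 / abs(slope)


length_m = 400 * 0.3048      # assumed section length, feet to meters
print(beat_wavelength_m(55e9))
print(beat_condition_spacing_hz(55e9, length_m) / 1e6)
```

Under these assumptions the beat wavelength at 55 GHz is about a quarter of a meter, and the condition recurs at intervals a little under 60 MHz, which is of the same order as the spacing quoted above. The spacing scales inversely with the section length.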

It can be shown® that conversions 50 dB down at each end of the 
test section will cause 10 per cent additional loss at the loss-peak 
frequencies in a 450-foot section; to reduce this number significantly 
requires flatness beyond that obtainable with simple machining tech- 
niques. For our experiments, therefore, we selected the best mesh and 
piston available, and chose test frequencies which avoided the loss 
peaks. This precaution was unnecessary in sections 1500 feet or more 
long, as the extra peak loss was then below experimental uncertainty. 

The second type of intertrip interaction was avoided by identifying 
and observing the spurious pulse trains arising from the delay line. 
Both TE₀₁, which is reflected for several trips with rapidly decreasing
amplitude because of the mismatch at the transducer, and TE₀₂,
which is generated at a low level in the taper to the transducer but 
is then almost totally reflected from taper and mesh on successive 
trips, are important. Certain test section-delay line combinations 
with nearly rational length ratios were not measured because these 
effects were observable. In general they become less important as the 
delay line length (and therefore its loss) increases. 

A third possible cause of nonlinearity between pulse height and 
trip number is the receiver down-converter noise. This effect was 
observable only after many trips when the signal pulse was much at- 
tenuated and the receiver attenuator was set near zero; it was easily 
avoided by monitoring the signal-to-noise ratio. 

Two other major sources of inaccuracy are oscillator stability and
oxygen absorption. At one atmosphere pressure, contamination of the
nitrogen filling gas by 0.02 percent of oxygen will increase the
measured loss at 60 GHz by approximately 1 per cent. Thus the elaborate
flushing procedures mentioned earlier were followed.

The oscillator frequency stability must be sufficient to hold the 
beating oscillator level at the receiver down-converter constant dur- 
ing a measurement run. The level will vary with frequency because 
the main return from the mesh at the end of the delay line will phase 
with the reflection from the transducer mismatch. In addition, the 
return from the mesh will change when the test line is in the vicinity 
of resonance for the beating oscillator frequency. These effects be- 
come severe as the lengths L and I become large. In the present ex- 
periments the BWO beam supply was regulated to a few millivolts, 
giving frequency stability of a few tens of kHz, but for lengths of 
either delay or test line of over 1000 feet it was necessary to monitor 
the converter crystal current very carefully to avoid serious loss of 
precision, and for lengths over 5,000 feet precise measurements be- 
came difficult. 


IV. RESULTS AND COMPARISONS 


4.1 Individual Line Sections 


The measured attenuation constants vs frequency for several line 
sections are plotted in Figs. 3, 4, and 5, grouped roughly by length. 
The results for sections* AB, BC, CD, and DE, all of which are under 
500 feet long, are plotted together in Fig. 3. Sections AD, DH, HL, 
LO, UR, and RO, from 1289 to 1952 feet long, are shown in Fig. 4. 
Results for the three long straight runs AH, HO and UO, all around 
3000 feet long, are shown in Fig. 5. Notice that in all cases the verti- 
cal scales are greatly expanded. 

On each figure is indicated the estimated standard deviation of the 
experimental points, $\langle (\alpha - \alpha_t)^2 \rangle^{1/2}$, as calculated from equations (1)
and (6). The actual value of this quantity of course varied some- 
what from point to point and curve to curve; the indicated amount 
is a rough average. In general the actual value tended to be a bit 
larger at the lower frequencies in the band and smaller at the higher 
frequencies, because of the greater number of trips observable at 
lower attenuations. 

The over-all high quality of the helix waveguide is evidenced by
the low observed attenuation constants. The theoretical loss for
perfect solid-copper guide varies from 1.79 dB per mile at 50 GHz to
1.35 dB per mile at 60 GHz. Thus the additional losses from all
causes, including finite helix-wire size and pitch, surface roughness,
and manufacturing tolerances, total less than 1 dB per mile for most
sections.

*The first letter in the section code refers to the manhole in which the
coupling mesh was located and the second to the manhole with the piston.
Manhole locations are indicated in Fig. 1.

The rapid variations in loss vs frequency for each section are 
within the estimated experimental error in most cases. As mentioned 
earlier, in this waveguide we would expect to see no variations vs fre- 
quency with periods less than 6 GHz. None were observed, except the 
spurious TE$_{02}$ peaks discussed in Section 3.38, and a peak at 54.3 GHz
in line LO which is believed to be from a mechanical failure of the 
steel jacket-lossy lining bond in some of the helix waveguide pieces. 
Experiments over much wider frequency bands would be necessary 
to detect the very slow variations which are expected from the ran- 
dom curvature of the waveguide axis. 

On the other hand, the difference in measured loss between one 
line section and another of roughly the same length is much greater 
than experimental error in most cases, and is therefore quite real.
This difference is discussed in detail in Section 4.4, where it is com- 
pared with a theoretical estimate. It results from the statistical in- 
dependence of the loss components between one section and the next; 
the variations vs frequency for a single section should be as great 
over frequency differences large enough that the statistical inde- 
pendence again holds. 

Figure 6 shows the average attenuation constant for all of the hori- 
zontally straight line sections, obtained by adding the measured 
losses for sections AH, HO and UO and dividing by their total length. 
Figure 7 shows the average attenuation constant for the entire loop 
including sharp bends in the corner manholes. The mesh was in man- 
hole A and the shorting piston in the laboratory building at the other 
end of the waveguide loop; thus everything was included except the 
short delay-line section between the building and A. 


4.2 Bend Losses 


The losses of several models of sharp bends for use in the corner 
manholes were measured by taking the difference between the losses 
of line sections with and without the bends included. The coupling 
mesh was placed in an appropriate manhole ahead of the corner 
manhole, and the shorting piston was placed in the corner manhole, 
first following the bend and then preceding it. In the measurement
with bend included, for nonhelix bends, a short section of helix waveguide
was placed between the bend and the piston to serve as a mode
filter. 

The losses of bends 2, 3, and 4 are shown in Fig. 8. These bends 
were used in manholes H, O, and U, respectively, for the measure- 
ment shown in Fig. 7. Bend 2 is made of two 90° mitered elbows 
back to back, with a rotary joint between them adjusted to give the 
42° horizontal angle. The measured loss agrees well with theory." 
Bends 3 and 4 are % inch inside diameter helix waveguide with lossy 
jacket, bent 90° on elastically tapered curves, with effective bend 
radii of about 3 meters. The loss of bend 3 is in agreement with 
theory; that of bend 4 is considerably higher. 

For all three bends the measurement accuracy is a few hundredths 
of one dB, thus the plotted variations vs frequency for bends 2 and 
4 are real. For bend 2, some phasing between spurious modes gen- 
erated at the two elbows is to be expected, but for bend 4 the varia- 
tions further indicate that the helix waveguide was not properly 
constructed. 

Figure 9 shows the measured attenuation of section XU, which
contains the 708-foot radius horizontal bend. It also shows the predicted
straight loss of XU, obtained by subtracting the calculated
bend loss from the measured loss. The agreement with the measured
losses for other straight sections, as shown in Figs. 3 through 6, indicates
that the effect of the horizontal bend is well predicted by theory.

Fig. 9 — Added loss caused by large radius of curvature bend.


4.3 Sums of Sections and Residual Errors 


An important purpose of these experiments was to determine 
whether the sum of the losses of several sections measured individually 
would be the same as the loss measured for a line made up of the 
same sections connected. The assumption that this is indeed true is 
inherent in all predictions of the losses of long lines based on measure- 
ments on short lines. It is also inherent in our technique for measuring 
bends, and it underlies our assumption of the validity of the shuttle- 
pulse technique in general. Thus, although there were no known 
reasons to expect the assumption to be false, an experimental verifica- 
tion was considered important. 

Figure 10 shows the difference in dB between the measured loss 
of section AD and the sum of the measured losses of sections AB, 
BC, and CD, as a function of frequency. The differences are very 
small indeed. The dashed lines indicate the average across frequency 
of the estimated standard deviation of the differences about zero as 
calculated from the sum of the mean square errors of the individual 
measurements as given by (6). The dashed lines thus indicate only 
the effect of the scatter of the data points and do not include any 
effects such as long-term drift of the apparatus between measure- 
ments, variations in oxygen contamination between sections, residual 
tails of the spurious TE$_{02}$ loss peaks, or absolute errors such as in
the end correction due to mesh and piston. 

Fig. 10 — Residual dB for difference AD − (AB + BC + CD). The dashed lines
are the estimated standard deviations about zero owing to measurement variations
only.

The scatter in Fig. 10 of the experimental differences is therefore
quite satisfactory. The measured loss of section AD is about 0.6 dB
and of its shorter component sections about 0.2 dB; the largest 
observed difference is thus just 2 per cent of the loss of AD, and 
about half of the observed differences are within 1 per cent. All of 
the differences would be shifted a constant 0.004 dB, or 2/3 per cent 
of the loss of AD, by a fixed absolute error of 0.002 dB in all meas- 
urements. That amount is roughly the limit of accuracy of the 
measurement of the mesh and piston end correction. In addition, 
oxygen contamination would cause an error rising from zero at 50 
GHz to 1 per cent at 60 GHz in any section with 0.02 per cent oxygen 
from improper flushing or filling. 
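
The 0.004 dB figure follows from simple bookkeeping: the same absolute error enters once through the measurement of AD and once through each of the three component sections. A minimal sketch of that arithmetic in Python (the function name is illustrative and not from the paper):

```python
def residual_shift_db(n_parts, fixed_error_db):
    """Shift in (whole - sum of parts) when every measurement
    carries the same absolute error, in dB."""
    # The whole line contributes +e; each of the n parts contributes -e.
    return fixed_error_db - n_parts * fixed_error_db


shift = residual_shift_db(n_parts=3, fixed_error_db=0.002)  # AD vs AB + BC + CD
percent_of_ad = 100.0 * abs(shift) / 0.6                    # AD loss is about 0.6 dB
print(f"shift = {shift:+.3f} dB, {percent_of_ad:.2f} per cent of AD")
```

The magnitude of the shift grows with the number of component sections, which is why the end correction matters most for sums of many short sections.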

The addition of longer sections, where the end correction is unim- 
portant, is shown in Fig. 11. Here the difference is between the loss of 
section AO and the sum of the losses of sections AH and HO and of 
bend 2. Bend 2 was itself measured by taking the difference of the losses 
of section GH with and without the bend included. The dashed lines 
are again the estimated standard deviation about zero as calculated from 
data point scatter only. The loss of section AO is about 3.6 dB, so the 
dashed lines are at slightly over ±1 per cent. The experimental points
fall quite satisfactorily within them. 

Other additions were checked with similar results. The direct measure- 
ments made during these experiments are thus believed to be accurate 
to the order of about ±1 per cent or ±0.005 dB, whichever is greater,
and the sums of losses of individual sections are the same as the loss 
of the sum of the sections to within that measurement accuracy. 


4.4 Statistical Confidence Limits 


Fig. 11 — Residual dB for difference AO − (AH + HO + bend 2).

A further purpose of these experiments was to determine experimentally
the variation in attenuation between different waveguide
sections and to try to discover the length of guide that must be
measured in order to obtain estimates of a given accuracy on the
loss of very long waveguide runs. We assume that the loss variations 
among sections are caused by variations in the mode conversion in the 
different sections and are thus determined by a random process whose 
statistics are related to the statistics of the mechanical tolerances of 
the guide.* > 


A theoretical solution for the confidence limits on loss as a function 
of sample length requires knowledge of the probability function for the 
additional loss caused by mode conversion. An exact solution is difficult 
when the differential loss between coupled modes is nonzero. An ap- 
proximate solution for two modes and two polarizations is given in 
the Appendix; it predicts a normal distribution with mean unity and 
variance $1/(4|\Delta\alpha|L)$ for the quantity $A/\langle A\rangle$. Here $A$ is the additional
loss caused by mode conversion and is thus the difference between actual
loss and theoretical heat loss. $\langle A\rangle$ is its expected value.

In Fig. 12 the $\pm 2\sigma$ lines for the predicted theoretical distribution
are plotted as a function of line length along with experimental values
of $A/\langle A\rangle$ for all line sections measured. The experimental value of $\langle A\rangle$
was derived from the curve shown in Fig. 6, so is itself subject to experimental
error. The values of $A/\langle A\rangle$ plotted for each line are the means
of the maximum and minimum values observed vs frequency.

The fit between theory and experiment would be better if the plotted 
curves were at $\pm 4\sigma$ instead of $\pm 2\sigma$. However, the approximations of
the theory, which includes only one spurious mode, and the experimental 
accuracies of the points are probably sufficient causes for the poor agreement.
In addition the manufacturing variations in our virtually handmade
waveguide may be considerable. It should be remembered that
the quantity $A$ is the additional loss only, and the value of $\langle A\rangle/L$ is
less than one dB per mile. Thus less than 4 per cent variation in the
observed total loss will cause a 10 per cent variation in $A$. The experimental
errors are similarly magnified.

Assuming that the theoretically predicted variance of $A$ is correct,
one needs to measure 2000 feet of our present waveguide to assure 95
per cent confidence that the measurement is within 5 per cent of the
true value of $A$ or thus within 2 per cent of the true value of attenuation
constant $\alpha$. Two per cent gives the loss of a 20-mile section to
±1 dB. If the variance were twice that predicted, one would need to
measure four times as much guide, or 8000 feet, for the same confidence. 
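
The numbers in this paragraph can be checked against the Appendix result that $A/\langle A\rangle$ has variance $1/(4|\Delta\alpha L|)$, so that the 95 per cent limits are $\pm 1/\sqrt{|\Delta\alpha L|}$. The sketch below is only an illustration under stated assumptions: it takes $|\Delta\alpha| = 0.184$ neper per foot from the Appendix, uses round figures of about 1 dB per mile additional loss out of about 2.5 dB per mile total, and introduces a hypothetical `sigma_scale` multiplier on the predicted standard deviation. Because the length needed for a fixed tolerance grows as the square of the standard deviation, a factor of four in length corresponds to a standard deviation twice the predicted one.

```python
import math

DELTA_ALPHA = 0.184  # |differential loss| in nepers per foot (Appendix value)


def two_sigma_fraction(length_ft, sigma_scale=1.0):
    """Half-width of the 95 per cent limits on A/<A>, i.e. 2*sigma,
    using variance 1/(4*|delta_alpha|*L). sigma_scale is a hypothetical
    multiplier on the predicted standard deviation."""
    return sigma_scale / math.sqrt(DELTA_ALPHA * length_ft)


def length_for_tolerance_ft(tolerance, sigma_scale=1.0):
    """Guide length at which 2*sigma equals the requested fractional tolerance."""
    return (sigma_scale / tolerance) ** 2 / DELTA_ALPHA


print(two_sigma_fraction(2000.0))          # close to 0.05
print(length_for_tolerance_ft(0.05))       # roughly 2000 feet
print(length_for_tolerance_ft(0.05, 2.0))  # four times longer

# Assumed round numbers: additional loss about 1 dB per mile out of
# about 2.5 dB per mile total, so 5 per cent in A is 2 per cent in total loss.
total_fraction = 0.05 * 1.0 / 2.5
print(total_fraction * 2.5 * 20)           # dB uncertainty over 20 miles
```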


V. SUMMARY 


Precise measurements have been made across the 50 GHz to 60 GHz 
band on many sections of a two-mile long loop of two-inch inside diameter 
helix waveguide. The measurement accuracy is approximately ±1 per
cent or ±0.005 dB, whichever is greater.

The waveguide is of high quality; the average measured attenua- 
tion varies smoothly from 2.62 dB per mile at 50 GHz to 2.32 dB 
per mile at 60 GHz. Fast variations vs frequency were within ex- 
perimental error. 

The losses of several long sections were compared with the sum 
of the losses of the smaller sections of which they were composed; 
the agreement was excellent and within experimental error. Several 
short-radius bends of different angles were included in the line and 
measured; losses less than 0.8 dB across the band were observed for 
a 42° bend made of two mitered elbows. 

Differences among line sections in the values of their measured 
losses were considerably greater than variations vs frequency for 
any one section, as was expected. It was found that quantities of 
waveguide from 2000 to 8000 feet must be measured in order to be 
95 per cent assured that the measurement is typical of the population 
to within 2 per cent. 


APPENDIX 
Approximate Confidence Limits for the Variations Between 
Guide Sections 


For the case of one spurious mode with nonzero differential loss
$\Delta\alpha$, Young has shown® that the additional loss in a guide of length
$L$ is given by convolving the expression for the additional loss when
$\Delta\alpha = 0$ with a particular loss function. Thus


$$A(t) = \int_{-\infty}^{\infty} B(t - s)\, A_0(s)\, ds, \qquad (7)$$

where $A_0(t)$ is the additional loss if $\Delta\alpha = 0$, and where $B$ is the
function,


$$B(t) = \frac{2}{|\Delta\alpha|}\, \frac{1}{1 + \left(\dfrac{2\pi t}{\Delta\alpha}\right)^2}. \qquad (8)$$


The variable $t$ is most conveniently taken as $t = \Delta\beta/2\pi$, where $\Delta\beta$
is the differential phase constant between signal and spurious modes.
$t$ is thus roughly proportional to the wavelength $\lambda$. The function $A_0$
has been extensively studied by Rowe and Warters,> who show that
under reasonable restrictions it is a band-limited function and can
thus be expressed by its values at its sample points, which are
spaced by


$$\Delta t = \frac{1}{2L}. \qquad (9)$$


If the convolution function $B$ is much broader than $\Delta t$, meaning that
$|\pi/\Delta\alpha L| \ll 1$, we can approximate $A_0$ by constant line segments
through its sample-point values, and can estimate the convolution
integral (7) as a summation over the sample points. This gives


$$A(t) = \sum_i B(t - s_i)\, A_0(s_i)\, \Delta s, \qquad (10)$$

with

$$s_i = \Delta\beta_i/2\pi = i/2L.$$

For convenience we study $A(\Delta\beta/2\pi)$ at the $N$th sample point, $t_N = \Delta\beta_N/2\pi
= N/2L$. After substituting $n = N - i$ in the summation, we have

$$A(t_N) = \frac{1}{\pi n_0} \sum_n \frac{A_0(t_{N-n})}{1 + (n/n_0)^2}, \qquad (12)$$

where


$$n_0 = \frac{|\Delta\alpha L|}{\pi}.$$


From our earlier requirement on the width of the loss function $B$, we
require $n_0 \gg 1$.

If the coupling coefficient between signal and spurious mode is expressed
as a complex Fourier series for the length $L$, the additional
loss $A_0$ is simply expressed in terms of the Fourier coefficients.® For
two polarizations of the spurious mode,


$$A_0(t) = \frac{L^2}{2} \sum_{k=1}^{4} I_k^2(t), \qquad (13)$$

where


$$I_k(t) = \sum_n K_{kn}\,(-1)^n\, \frac{\sin 2\pi L t}{\pi(2Lt - n)}. \qquad (14)$$

The index $n$ denotes the $n$th Fourier coefficient; the index $k$ separates
the real and imaginary parts of the coefficients for the two polarizations
and thus has four possible values. If the $x$ and $y$ components
of the mechanical imperfections are independent random Gaussian
variables with white power spectrum, then so are the $K_{kn}$, at least for
large $L$ and over small percentage bandwidths.®> Under these assumptions
one finds that


$$\langle A_0(t_N)\rangle = 2L^2\langle K_{kN}^2\rangle = 2L^2\langle K^2\rangle \qquad (15)$$

and

$$\langle A_0(t_N)\,A_0(t_M)\rangle = \begin{cases} 6L^4\langle K^2\rangle^2, & M = N \\ 4L^4\langle K^2\rangle^2, & M \neq N. \end{cases} \qquad (16)$$

Expressions (15) and (16) are then used to calculate the lower-order
statistics of the loss function $A$ from (12), giving


$$\langle A(t_N)\rangle = 2L^2\langle K^2\rangle = \langle A_0(t_N)\rangle \qquad (17)$$

$$\langle \delta A^2(t_N)\rangle = \langle (A - \langle A\rangle)^2\rangle = \frac{L^4\langle K^2\rangle^2}{|\Delta\alpha L|}. \qquad (18)$$


The requirement $|\Delta\alpha L| \gg \pi$ has been used to simplify the expressions.
Since $A_0$ is a sum of squares of samples from a Gaussian process,
and since $A$ is a weighted sum of values of $A_0$, it seems reasonable
that the distribution function for $A$ should be close to a chi-squared
distribution with appropriate normalization. However, for large $\Delta\alpha L$
the approximate chi-squared distribution will have many degrees of
freedom, approaching a normal distribution. Therefore, for large $\Delta\alpha L$
the variable $A/\langle A\rangle$ becomes normally distributed, with unit mean
and variance $1/(4|\Delta\alpha L|)$. For this distribution the 95 per cent confidence
limits are the $\pm 2\sigma$ lines.


$$\frac{A}{\langle A\rangle} = 1 \pm 2\sigma = 1 \pm \frac{1}{\sqrt{|\Delta\alpha L|}}. \qquad (19)$$


The lines are plotted, together with the experimental observations on
various waveguide sections of different lengths, in Fig. 12. The value
of $\Delta\alpha$ used is −0.184 neper per foot, which is typical of the differential
TE,» loss in lossy-jacketed helix waveguide. For sections with L greater 
than 300 feet, $|\Delta\alpha L|$ is greater than 55, so the approximation $|\Delta\alpha L| \gg \pi$
is well satisfied.
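
The variance $1/(4|\Delta\alpha L|)$ can be checked numerically without the closed-form simplification. The sketch below is an illustration under the Appendix's assumptions, not the author's computation: it treats the samples of $A_0$ as independent, weights them by the Lorentzian factor $1/(1 + (n/n_0)^2)$ with $n_0 = |\Delta\alpha L|/\pi$, and uses the ratio of variance to squared mean of $A_0$ equal to $(6 - 4)/4 = 1/2$ implied by (15) and (16). The truncation parameter `span` is a choice made here.

```python
import math


def normalized_variance(delta_alpha_l, span=200):
    """Variance of A/<A> for Lorentzian-weighted, independent samples of A0.

    Assumes weights 1/(1 + (n/n0)^2) with n0 = |delta_alpha*L|/pi and
    var(A0)/<A0>^2 = 1/2. The sum is truncated at |n| <= span*n0.
    """
    n0 = delta_alpha_l / math.pi
    n_max = int(span * n0)
    weights = [1.0 / (1.0 + (n / n0) ** 2) for n in range(-n_max, n_max + 1)]
    sum_w = sum(weights)
    sum_w2 = sum(w * w for w in weights)
    return 0.5 * sum_w2 / sum_w ** 2


for dal in (55.2, 368.0):  # 300 ft and 2000 ft at 0.184 neper per foot
    summed = normalized_variance(dal)
    closed_form = 1.0 / (4.0 * dal)
    print(dal, summed, closed_form)
```

The summed and closed-form values agree to within about one per cent for these lengths; the small residual comes from truncating the slowly decaying Lorentzian tails.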
A Statistical Theory of Mobile-Radio


Reception 


By R. H. CLARKE 


The statistical characteristics of the fields and signals in the reception
of radio frequencies by a moving vehicle are deduced from a scattering propagation
model. The model assumes that the field incident on the receiver
antenna is composed of randomly phased azimuthal plane waves of arbitrary
azimuth angles. Amplitude and phase distributions and spatial
correlations of fields and signals are deduced, and a simple direct relationship
is established between the signal amplitude spectrum and the
product of the incident plane waves' angular distribution and the azimuthal
antenna gain.

The coherence of two mobile-radio signals of different frequencies is
shown to depend on the statistical distribution of the relative time delays
in the arrival of the component waves, and the coherent bandwidth is shown
to be the inverse of the spread in time delays.

Wherever possible theoretical predictions are compared with the experimental
results. There is sufficient agreement to indicate the validity of the
approach. Agreement improves if allowance is made for the nonstationary
character of mobile-radio signals.


I. INTRODUCTION 


In a typical mobile-radio situation one station is fixed in position 
while the other is moving, usually in such a way that the direct line 
between transmitter and receiver is obstructed by buildings. At ultra- 
high frequencies and above, therefore, the mode of propagation of the 
electromagnetic energy from transmitter to receiver will be largely 
by way of scattering, either by reflection from the flat sides of build- 
ings or by diffraction around such buildings or other man-made or 
natural obstacles. 


1.1 The Model 


It therefore seems reasonable to suppose that at any point the
received field is made up of a number of generally horizontally traveling
free-space plane waves whose azimuthal angles of arrival occur
at random for different positions of the receiver, and whose phases
are completely random such that the phase is rectangularly distributed
throughout 0 to $2\pi$. The phase and angle of arrival of each component
wave will be assumed to be statistically independent. The probability
density function $p(\alpha)$, which gives the probability $p(\alpha)\,d\alpha$ that a component
plane wave will occur in the azimuthal sector from $\alpha$ to $\alpha + d\alpha$,
will not be specified, since it will be different for different environments,
and is also likely to vary from region to region within one
environment; but the assumption that the phase $\phi$ has a rectangular
probability density function throughout 0 to $2\pi$ will be made in all
cases.

For simplicity, it will be assumed that at every point there are 
exactly N component waves and that these N waves have the same 
amplitude. In addition it will be assumed that the transmitted radia- 
tion is vertically polarized, that is, with the electric-field vector di- 
rected vertically, and that the polarization is unchanged on scattering 
so that the received field is also vertically polarized. 

The model described so far gives what might be termed the “scat- 
tered field,” since the energy arrives at the receiver by way of a 
number of indirect paths. Another term for this scattered field is the 
“incoherent field,” because its phase is completely random. Some- 
times a significant fraction of the total received energy arrives by 
way of the direct line-of-sight path from transmitter to receiver. The 
phase of the “direct wave” is nonrandom and it may therefore be 
described as a “coherent wave.” It will be seen later that the field 
in a heavily built-up area such as New York City is entirely of the 
scattered type, whereas the field in a suburban area with the trans- 
mitter not more than a mile or two distant is often a combination of 
a scattered field with a direct wave. 


1.2 Comparison With Other Proposed Models 


J. F. Ossanna’ was the first to attempt an explanation of the sta- 
tistical character of the received mobile-radio signal in terms of a set 
of interfering waves. He was concerned with measurements taken in 
a suburban environment, and assumed that reflection occurred at the 
flat sides of houses and that the incident and reflected waves form an 
interference pattern through which the receiver moves. He then as- 
sumed that all orientations of the sides of houses are equally likely, 
and hence obtained spectra for the randomly fading signal with the
angle between the direction of vehicle motion and the direction to
the transmitter as a parameter. 

There is quite good agreement between Ossanna’s theoretical spectra 
and those derived from measurements on several suburban streets 
situated within 2 miles of the transmitter. There is marked disagree- 
ment, however, at very low frequencies and at frequencies in the 
region of the sharp cut-off associated with the maximum Doppler 
frequency shift. At very low frequencies the spectral energy is al- 
ways observed to be higher than that predicted by theory, whether 
Ossanna’s or the one we use in this paper. The reason for this is that 
neither theoretical model takes into account the large-scale varia- 
tions in total energy which result from the changing topography 
between transmitter and mobile receiver. 

The basic difference between Ossanna’s theoretical model and the 
model used here is that the former is essentially a reflection model 
whereas the latter is essentially a scattering model and so includes 
the former as a special case. An example of the limitations of the 
reflection model can be seen from the experimental spectra plotted 
in Ossanna’s paper. The spectra are derived from signal-fading rec- 
ords made on several streets whose inclination to the transmitter 
direction ranged from 15 degrees to 84 degrees, and in each case 
there is evidence of a shelf which cuts off at twice the maximum 
Doppler frequency shift. Ignoring the higher harmonics generated in 
the detection process, the reflection model predicts a spectral cutoff 
which depends on the direction of the street with respect to the trans- 
mitter, ranging from the maximum Doppler frequency shift itself 
when the street is at right angles to the transmitter direction to twice 
that value when the street is in line with the transmitter. 

With the scattering model, on the other hand, the angular distribution
$p(\alpha)$ of scattered waves can be chosen to predict the existence
of a spectral shelf out to twice the Doppler frequency shift for any 
street direction. Another feature of the reflection model which makes 
it rather inflexible is that for every randomly oriented reflected wave 
there exists a direct wave incident on the mobile receiver and carry- 
ing the same power. Thus the ratio of coherent to incoherent power 
in the received signal is fixed, whereas in the scattering model this 
ratio is arbitrary and may be adjusted according to the environment. 

In his study of energy reception in mobile radio, E. N. Gilbert? 
examined several models of the scattering type and established a 
number of important relationships between them. One feature common
to all of them, however, was the uniform distribution of waves
in angle, although he briefly mentioned the effect of a single strong
component arriving directly from the transmitter. The first model
Gilbert considered was that of $N$ waves arriving from fixed directions,
equally spaced in angle. The phases of the waves were assumed to be
independent and uniformly distributed throughout 0 to $2\pi$; their
amplitudes were assumed to be Rayleigh distributed and independent,
but with the same variance. In a second model the angles of arrival
were allowed to occur at random with equal probability for any
direction; the phases were again completely random but the amplitudes
were assumed to be constant. (This model is the same as the
one we use in this paper, with the restriction that $p(\alpha) = (2\pi)^{-1}$.) A
third model was an extension of the second to include the case of an 
arbitrary distribution of the amplitudes. Gilbert showed that the 
second and third models were equivalent to the first for sufficiently 
large N. 


1.3 Scope 


This paper shows that the scattering model can be used to predict 
the statistical characteristics of the signal received at the antenna 
terminals, hence at the output of a square-law or envelope detector, 
of the mobile receiving vehicle. These characteristics include the 
probability distributions of amplitude and phase, spatial correlations, 
amplitude spectra, and frequency correlations. 

A simple relationship is established between the spectrum of the
signal input and the product of the azimuthal power gain $g(\alpha)$ of the
antenna and the probability distribution function $p(\alpha)$ of the angle
of arrival of the component waves. This relationship will be particularly
useful in analyzing mobile-radio systems with directional antennas
on the mobile unit.

Other topics discussed are the use of space and frequency diversity, 
coherent bandwidth, and random frequency modulation. Some com- 
ments also are made on the nonstationary aspects of mobile-radio 
fields and on the consequent need for their characterization in terms 
which will be useful to the mobile-radio system designer. Whenever 
possible the theory is discussed in the light of available experiments. 


II. FIRST-ORDER STATISTICS OF THE FIELD


2.1 Theory 


Under the assumption that the total field at any receiving point
is vertically polarized and is composed of the superposition of $N$
waves, the $n$th wave arriving at an angle $\alpha_n$ to the $x$ axis (Fig. 1)
with phase $\phi_n$, the field components at point 0 (the zero phase reference
point) are


$$E_z = E_0 \sum_{n=1}^{N} \exp\{j\phi_n\} \qquad (1)$$

$$H_x = -\frac{E_0}{\eta} \sum_{n=1}^{N} \sin\alpha_n\, \exp\{j\phi_n\} \qquad (2)$$

$$H_y = \frac{E_0}{\eta} \sum_{n=1}^{N} \cos\alpha_n\, \exp\{j\phi_n\}. \qquad (3)$$


In these equations $E_0$ is the common (real) amplitude of the $N$ waves
and $\eta$ is the intrinsic impedance of free space. The time variation is
understood to be of the form $\exp\{j\omega t\}$. Notice that $E_z$ will be proportional
to the signal input to the receiver when a vertical dipole antenna
is used, and that $E_z$, $H_x$, and $H_y$ will be proportional to the
three inputs from a Pierce antenna system.? 

The three field components $E_z$, $H_x$, and $H_y$ are complex Gaussian
random variables, to a good approximation, provided that $N$ is sufficiently
large. This is a consequence of the Central Limit Theorem
and the assumption that the phases $\phi_n$ are independent of each other
and of the angles of arrival $\alpha_n$. Thus each field component has a real
part and an imaginary part which are approximately zero-mean Gaussian
random variables of equal variance, the approximation improving
for larger $N$, and provided that the phases $\phi_n$ are rectangularly
distributed throughout 0 to $2\pi$. Appendix A shows that under the
same assumptions the real and imaginary parts of each field component
are uncorrelated; they are therefore approximately statistically
independent.? 

An important consequence of this is that the envelopes of all three
field components (hence of the signals at the terminals of a vertical
dipole antenna and of two orthogonal, vertical loops) will be Rayleigh
distributed; and their phases will be rectangularly distributed
throughout 0 to $2\pi$. (See pp. 160-161 of Ref. 3.)

Fig. 1 — A typical component wave and the two field points 0 and 0′.
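
The Rayleigh behaviour can be illustrated with a short Monte Carlo experiment on equation (1). This is a sketch under stated assumptions rather than anything from the paper: $E_0 = 1$, $N = 20$ waves, a fixed random seed, and 20000 trials are all arbitrary choices made here. For a Rayleigh envelope the power $|E_z|^2$ is exponentially distributed, so the mean power should be close to $N E_0^2$ and the fraction of samples below the mean should be close to $1 - e^{-1}$.

```python
import cmath
import math
import random


def simulate_power(num_waves, trials, seed=1):
    """Return samples of |E_z|^2 for E_0 = 1, with independent phases
    uniform on [0, 2*pi), following equation (1)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        field = sum(cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
                    for _ in range(num_waves))
        samples.append(abs(field) ** 2)
    return samples


N = 20
power = simulate_power(N, trials=20000)
mean_power = sum(power) / len(power)
frac_below_mean = sum(p < mean_power for p in power) / len(power)
print(mean_power / N)      # near 1: mean power is N * E_0**2
print(frac_below_mean)     # near 1 - exp(-1) for a Rayleigh envelope
```

Only $E_z$ is simulated, so the angles of arrival do not enter; repeating the experiment with the $\sin\alpha_n$ or $\cos\alpha_n$ weights of equations (2) and (3) would require choosing a density $p(\alpha)$.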

If, in addition to the N scattered waves, there is a wave of sig- 
nificant magnitude arriving directly from the transmitter, the result- 
ing envelope and phase will no longer be respectively Rayleigh and 
rectangularly distributed. The relevant distributions will then be 
those derived by Rice* for a sine wave plus random noise. These 
distributions are, in general, quite complicated (see pp. 165-167 of 
Ref. 3), but in the limit, when the power in the direct wave is con- 
siderably greater than that in the combined scattered waves, both 
the phase and the envelope are approximately Gaussian distributed; 
the phase with zero mean and the envelope with a mean value equal 
to the amplitude of the direct wave. 


2.2 Experiment 


W. R. Young® has found that the Rayleigh distribution gives an 
excellent fit to the observed amplitude fluctuations in mobile-radio 
reception at 150, 450, 900, and 3700 MHz in New York City, pro- 
vided that the sample area is less than about 1000 feet square. Tri- 
fonov, Budko, and Zutov, in a review of several investigations at 50, 
150, and 800 MHz, also found that the Rayleigh distribution fits the 
data measured in rural suburbs at distances of about 5 and 9 km 
from the transmitter.° The fact that the measured distributions are 
Rayleigh in the above situations implies that there is no significant 
directly transmitted component and the fields are wholly of the 
scattered type, which seems physically reasonable. 

Trifonov and his colleagues also found that for short transmission 
distances in towns (about 1 km), the signal amplitude has a non- 
zero-mean Gaussian distribution; and that for a transmission distance 
of 11 km in woodland, the signal has a Rice distribution. In these 
two cases there is apparently a significant direct component wave, 
and in the first case, where the transmission distance is only 1 km, 
the power in the direct component is considerably greater than that 
in the combined scattered components. 

W. C. Jakes and D. O. Reudink have compared the statistical 
character of the amplitude of the fluctuating signal at the two frequencies
of 836 MHz and 11200 MHz on the same street in a suburban
environment at about 4 km from the transmitter. They find that the 
signal amplitudes are Rayleigh distributed at both frequencies, again
indicating that the direct wave is not significant.? This conclusion is
borne out, for reasons discussed in Section 3.2.3, by the shape of the 
amplitude spectra which were computed from the same data. 

The particular section of data which Jakes and Reudink analyzed 
was chosen with some care. The criterion of choice was that the data 
should “look” statistically uniform, and although this criterion is 
both arbitrary and subjective, it is important that it be applied in 
the absence of any other satisfactory criterion. The point is well il- 
lustrated by Fig. 2, which shows a section of signal-amplitude data 
at 836 MHz, obtained with a vertical dipole on a street adjacent to
that used by Jakes and Reudink. The speed of the mobile receiver was 
22 feet per second, and each of the five frames lasts about a second 
(time scale horizontal). The vertical scale is approximately linear 
in dB, covering a 70 dB spread with about 7 dB to each vertical 
division. 

Fig. 2 — Section of a mobile radio data run, showing the variation of signal
amplitude with time. (One vertical division is approximately 7 dB, and one horizontal
frame is approximately 1 second.)

There is an obvious change in the statistics of the received signal
in the fourth frame, compared with the others. (In fact, the fourth
frame corresponds to the position of a street intersection, with one
of the intersecting streets pointing in the direction of the transmitter.
Then, according to the arguments used above, there will be a strong 
direct component which will raise the average signal level and change 
the distribution from a Rayleigh to a Rice or even a Gaussian. The 
average signal level in the fourth frame does rise, and the distribution 
does appear to be more symmetrical.) Using all five frames to estimate 
the probability density function would therefore be misleading in 
this case since obviously different parts of the data are samples of 
different distributions. 

More subtle differences, as when the distributions underlying the 
data are all Rayleigh but with different variances over different parts 
of the run, can be equally misleading. Young found that whereas over 
fairly small areas of New York City the signal amplitude was accu- 
rately described by a Rayleigh distribution, over larger areas—even 
when the path of the receiver was roughly concentric with the trans- 
mitter—the data did not fit a Rayleigh distribution. This is examined 
in greater detail in Section VI. 


III. SPATIAL CORRELATION OF FIELDS


3.1 Theory


The field components at some point O (see Fig. 1) in the mobile-radio
field are given by equations (1), (2), and (3). At another point O’, a
distance $\xi$ away from O in the x-direction, the phase of the nth component
wave will no longer be $\phi_n$ but $\phi_n + k\xi\cos\alpha_n$, where
$k = 2\pi/\lambda$ is the free-space phase constant. In the case of the
electric field, the product of the complex conjugate of $E_z$ (the field
at O) with $E_z'$ (the field at O’) is

$$
\begin{aligned}
E_z^* E_z' &= E_0^2 \sum_{n=1}^{N} \exp\{-j\phi_n\} \sum_{m=1}^{N} \exp\{j(\phi_m + k\xi\cos\alpha_m)\} \\
           &= E_0^2 \sum_{n=1}^{N} \sum_{m=1}^{N} \exp\{j(\phi_m - \phi_n)\} \exp\{jk\xi\cos\alpha_m\}.
\end{aligned}
\tag{4}
$$

Taking the average (that is, expectation) of both sides of equation
(4), the autocovariance function of the electric field is

$$
R_{E_z}(\xi) = \langle E_z^* E_z' \rangle_{av}
 = E_0^2 \sum_{n=1}^{N} \sum_{m=1}^{N} \langle \exp\{j(\phi_m - \phi_n)\} \rangle_{av} \, \langle \exp\{jk\xi\cos\alpha_m\} \rangle_{av}.
\tag{5}
$$

The angle brackets denote “the average of” the quantity they
enclose, and in this case may be thought of as an ensemble average
over all the possible situations implied by the assumed statistics of
$\phi$ and $\alpha$. The right-hand side is written as the product of two
separate averages because of the statistical independence of $\phi$ and
$\alpha$. The first of these averages is zero except when m = n, so that

$$
R_{E_z}(\xi) = E_0^2 \sum_{n=1}^{N} \langle \exp\{jk\xi\cos\alpha_n\} \rangle_{av} \tag{6}
$$

$$
\phantom{R_{E_z}(\xi)} = N E_0^2 \int_{-\pi}^{+\pi} p(\alpha) \exp\{jk\xi\cos\alpha\} \, d\alpha. \tag{7}
$$


In the particular case when the N waves can arrive from any
direction with equal probability,

$$
p(\alpha) = \frac{1}{2\pi}, \qquad -\pi \le \alpha \le +\pi, \tag{8}
$$

the spatial autocovariance function of the electric field becomes

$$
R_{E_z}(\xi) = \langle E_z^* E_z' \rangle_{av} = N E_0^2 J_0(k\xi). \tag{9}
$$


The spatial autocovariance functions for the two components $H_x$
and $H_y$ of the magnetic field can similarly be shown to be

$$
R_{H_x}(\xi) = \langle H_x^* H_x' \rangle_{av} = \frac{N E_0^2}{2\eta^2} \left[ J_0(k\xi) + J_2(k\xi) \right] \tag{10}
$$

and

$$
R_{H_y}(\xi) = \langle H_y^* H_y' \rangle_{av} = \frac{N E_0^2}{2\eta^2} \left[ J_0(k\xi) - J_2(k\xi) \right] \tag{11}
$$

for waves arriving from any direction with equal probability. $J_0(\,)$
and $J_2(\,)$ are, respectively, the zero- and second-order Bessel functions
of the first kind. The autocovariance functions (9), (10), and
(11) are plotted in Fig. 3.
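As an illustrative check of the step from equation (7) to equation (9), the sketch below averages the phase factor exp{jkξ cos α} over a uniform distribution of arrival angles and compares the result with J0(kξ). It is only a numerical sketch: the function names, the series evaluation of the Bessel function, and the midpoint-rule step count are assumptions made here, not part of the paper.

```python
import cmath
import math


def bessel_j0(x: float, terms: int = 60) -> float:
    """J0(x) from its power series: sum over m of (-1)^m (x/2)^(2m) / (m!)^2."""
    quarter_x2 = (x / 2.0) ** 2
    term = 1.0  # m = 0 term of the series
    total = 0.0
    for m in range(terms):
        total += term
        term *= -quarter_x2 / ((m + 1) ** 2)
    return total


def averaged_phase_factor(k_xi: float, steps: int = 4096) -> complex:
    """Average exp(j*k*xi*cos(alpha)) over alpha uniform on [-pi, pi).

    This is the integral of equation (7) with p(alpha) = 1/(2*pi),
    divided by N*E0^2 and evaluated with the midpoint rule.
    """
    acc = 0j
    for i in range(steps):
        alpha = -math.pi + 2.0 * math.pi * (i + 0.5) / steps
        acc += cmath.exp(1j * k_xi * math.cos(alpha))
    return acc / steps


if __name__ == "__main__":
    for k_xi in (0.0, 1.0, 2.0, 5.0):
        numeric = averaged_phase_factor(k_xi)
        print(f"k*xi = {k_xi:3.1f}  average = {numeric.real:+.6f}  J0 = {bessel_j0(k_xi):+.6f}")
```

The imaginary part of the average vanishes by symmetry, so the normalized autocovariance is real, as equation (9) requires.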

For the same probability density function $p(\alpha)$ of equation (8),
it can be shown in a similar manner that the cross-correlations of
the field components are given by the following covariance functions:

$$
R_{E_z H_x}(\xi) = \langle E_z^* H_x' \rangle_{av} = 0 \tag{12}
$$

$$
R_{E_z H_y}(\xi) = \langle E_z^* H_y' \rangle_{av} = j \frac{N E_0^2}{\eta} J_1(k\xi) = R_{H_y E_z}(\xi) \tag{13}
$$

$$
R_{H_x H_y}(\xi) = \langle H_x^* H_y' \rangle_{av} = 0. \tag{14}
$$

These equations show that all three field components are uncorrelated
at zero spatial separation and therefore, since the fields are Gaussian,
independent there. Further, $E_z$, $H_x$ and $H_x$, $H_y$ are uncorrelated
and independent for all spatial separations, whereas $E_z$, $H_y$ are
correlated except at spatial separations corresponding to the zeros of
$J_1(\,)$, the first-order Bessel function of the first kind. The normalized
covariance function for $E_z$ and $H_y$ is plotted in Fig. 4.

Fig. 3. The normalized autocovariance functions $\rho_{E_z}(\xi)$,
$\rho_{H_x}(\xi)$, and $\rho_{H_y}(\xi)$ from equations (9), (10), and (11).

The autocovariance functions (9), (10), and (11) and the covariance
functions (12), (13), and (14) are for the particular case of
$p(\alpha)$ uniform in the interval $-\pi$ to $+\pi$. The autocovariance and
covariance functions for any $p(\alpha)$ can be obtained from equation (7)
and similar equations, but those derived here are useful illustrations
as well as useful approximations in practice.

In any practical case, however, the complex field components $E_z$,
$H_x$, and $H_y$ cannot be measured, but their magnitude (that is,
envelope) or squared magnitude (that is, energy) can. Appendix B shows
that the normalized autocovariance function of the departure from
its mean of the squared magnitude of a complex Gaussian random
variable, such as the field components $E_z$, $H_x$, and $H_y$, is equal
to the square of the normalized autocovariance function of the complex
random variable itself. Taking the electric field $E_z$ as an example,
the normalized autocovariance function of the departure $\delta|E_z|^2$ of
the squared modulus from its mean is, from equation (9),

$$
\rho_{\delta|E_z|^2}(\xi) = J_0^2(k\xi). \tag{15}
$$

Similar normalized autocovariance functions and covariance func- 
tions for the squared magnitude of all three field components can be 
obtained from equations (10) through (14), and they can be shown 
to agree with the theoretical energy density correlations obtained by 
Gilbert.2 This agreement was to be expected since energy density is
derived from the squared magnitude of the field components; in addition,
Gilbert used a theoretical model which is equivalent to that
used here with uniform $p(\alpha)$.

With regard to the envelope of each of the complex field components,
Appendix B also shows that the departure of the magnitude
of such complex random variables from their mean is described by
a normalized autocovariance function which is, to a good approximation,
equal to the square of the normalized autocovariance function
of the complex random variable itself. Thus, in the case of the electric
field component $E_z$, again from equation (9),

$$
\rho_{\delta|E_z|}(\xi) \approx J_0^2(k\xi). \tag{16}
$$

(This quantity is also the normalized correlation coefficient of the
signal envelopes at the terminals of two vertical monopole antennas
$\xi$ apart on the mobile receiving vehicle which is traveling through an
isotropically scattered field.) Similar normalized autocovariance and
covariance functions for the magnitudes of all three components can
be obtained from equations (10) through (14).

Fig. 4. The normalized covariance function $\rho_{E_z H_y}(\xi)$, from
equation (13).
3.2 Experiment


3.2.1 Spatial Diversity 


Only indirect experimental evidence is available at this time on 
the spatial correlation of mobile-radio fields. In his measurements of 
the predetection combining of the signals from several equally spaced 
vertical monopole antennas, A. J. Rustako found that there was very 
little difference between the cumulative distributions of the com- 
bined amplitudes from four antennas spaced 1/4, 3/4, and 5/4 wave- 
lengths apart. Equation (16) indicates that the correlation coef- 
ficients of the signal amplitudes at the antenna terminals at these 
three separations are about 0.25, 0.06, and 0.03, respectively. Bren- 
nan has shown that such correlations produce very little difference 
in the combined signal from two channels,® and so the difference is 
presumably even less with four channels combined. 
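The envelope correlations implied by equation (16) at the three antenna spacings can be evaluated directly. The sketch below is illustrative only: it assumes the reading of equation (16) as the square of J0(kξ) with k = 2π/λ, and the function names are hypothetical. Under that reading the computed values come out at roughly 0.22, 0.07, and 0.04, the same order as the rounded figures quoted above.

```python
import math


def bessel_j0(x: float, terms: int = 60) -> float:
    """J0(x) from its power series: sum over m of (-1)^m (x/2)^(2m) / (m!)^2."""
    quarter_x2 = (x / 2.0) ** 2
    term = 1.0  # m = 0 term of the series
    total = 0.0
    for m in range(terms):
        total += term
        term *= -quarter_x2 / ((m + 1) ** 2)
    return total


def envelope_correlation(spacing_in_wavelengths: float) -> float:
    """Approximate envelope correlation J0(k*xi)^2 for two vertical monopoles.

    The spacing xi is given in wavelengths, so k*xi = 2*pi*spacing.
    """
    k_xi = 2.0 * math.pi * spacing_in_wavelengths
    return bessel_j0(k_xi) ** 2


if __name__ == "__main__":
    for spacing in (0.25, 0.75, 1.25):
        print(f"{spacing:.2f} wavelengths: {envelope_correlation(spacing):.3f}")
```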


3.2.2 Field Diversity 


Equations (12), (13), and (14) show that all three field components
are uncorrelated (and therefore independent, because they are
complex Gaussian random variables) at zero separation. The possibility
of a “field diversity” system arising from this fact is exploited
in the energy density reception scheme from Pierce.? (An alternate 
scheme, proposed by W. C. Jakes, would use predetection combining.’® 
This has the advantage that the modulation is not affected.) W. C.-Y. 
Lee has devised and constructed an energy-density antenna’? and 
his analysis of the measurements,!* based on Gilbert’s isotropic 
scattering model, shows sufficient agreement with theory to indirectly
confirm equations (12), (13), and (14) at $\xi = 0$.


3.2.3 Frequency Spectra 


If the mobile receiving vehicle is moving with velocity V in the
x direction, the spatial displacement $\xi$ and the corresponding time
displacement $\tau$ are related by

$$
\xi = V\tau. \tag{17}
$$


Then all the spatial correlations derived in Section 3.1 can be trans- 
formed into time correlations by using equation (17). The Fourier 
transform of the time autocovariance function then yields the fre- 
quency spectrum. 

In the case of the signal at the terminals of a vertical monopole
antenna in an isotropically scattered field, equations (9) and (17) give
the normalized time autocovariance function as

$$
\rho_{E_z}(\tau) = J_0(kV\tau). \tag{18}
$$

The corresponding input spectrum (see Ref. 3, p. 104) is given by

$$
S_{E_z}(f) = \int_{-\infty}^{+\infty} \rho_{E_z}(\tau) \exp(-j\omega\tau) \, d\tau \tag{19}
$$

$$
\phantom{S_{E_z}(f)} = \frac{1}{\pi f_m} \left[ 1 - \left( \frac{f}{f_m} \right)^2 \right]^{-1/2}, \qquad |f| \le f_m. \tag{20}
$$

This spectrum is centered on the carrier frequency and is zero outside
the limits $\pm f_m$ on either side of the carrier, where

$$
f_m = \frac{V}{\lambda} \tag{21}
$$

is the maximum Doppler frequency shift.
Gilbert? has shown that the corresponding baseband output spec- 
trum from a perfect square-law detector is given by the complete 
elliptic integral, 


$$
S_{\delta|E_z|^2}(f) = \frac{1}{\pi^2 f_m} K\{ [1 - (f/2f_m)^2]^{1/2} \}. \tag{22}
$$


This output spectrum can be obtained either from the self-convolution
of the input spectrum of equation (20) or by taking the Fourier transform
of equation (15) expressed as a function of $\tau$ by means of
equation (17). The spectrum of equation (22) also describes to good
approximation the baseband output spectrum from an envelope detector
(that is, half-wave linear rectifier). Thus,

$$
S_{\delta|E_z|}(f) \approx \frac{1}{\pi^2 f_m} K\{ [1 - (f/2f_m)^2]^{1/2} \}. \tag{23}
$$

This is a consequence of the approximate equality of the spatial
autocovariance functions of equations (15) and (16).
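The elliptic-integral spectrum of equation (22) is easy to evaluate numerically. The sketch below is an illustration under stated assumptions: it takes the prefactor to be 1/(π² f_m), which is the value that gives the spectrum unit area (as it must have if it is the transform of a normalized autocovariance), and it computes the complete elliptic integral K by the arithmetic-geometric mean. The function names and the integration step count are choices made here.

```python
import math


def agm(a: float, b: float, iterations: int = 40) -> float:
    """Arithmetic-geometric mean of two non-negative numbers."""
    for _ in range(iterations):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a


def complete_elliptic_k(k: float) -> float:
    """Complete elliptic integral of the first kind for modulus k, 0 <= k < 1."""
    return math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - k * k)))


def baseband_spectrum(f: float, f_m: float) -> float:
    """Square-law baseband spectrum, assuming the prefactor 1/(pi^2 * f_m)."""
    u = abs(f) / (2.0 * f_m)
    if u > 1.0:
        return 0.0          # no content beyond twice the maximum Doppler shift
    if u == 0.0:
        return math.inf     # logarithmic singularity at zero frequency
    return complete_elliptic_k(math.sqrt(1.0 - u * u)) / (math.pi ** 2 * f_m)


def spectrum_area(f_m: float, steps: int = 20000) -> float:
    """Midpoint-rule integral of the spectrum over -2*f_m .. +2*f_m."""
    h = 4.0 * f_m / steps
    return sum(baseband_spectrum(-2.0 * f_m + (i + 0.5) * h, f_m) for i in range(steps)) * h


if __name__ == "__main__":
    f_m = 10.0  # hypothetical maximum Doppler shift in hertz
    print("area under spectrum:", spectrum_area(f_m))
    print("value at the cutoff 2*f_m:", baseband_spectrum(2.0 * f_m, f_m))
```

At the cutoff the modulus is zero and K = π/2, so the spectrum ends at the finite value 1/(2π f_m) rather than falling smoothly to zero, which is the sharp cutoff the text refers to.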

Figure 5 shows input and baseband output spectra for the above 
case of a vertical monopole antenna in an isotropically scattered 
field. The sharp cutoff in the baseband spectrum at twice the maxi- 
mum Doppler shift is observed to some extent in all measured mobile- 
radio spectra.*: § A small amount of spectral content will occur beyond 
this cutoff in the case of an envelope detector because of the higher 
order terms neglected in the analysis, and in all cases because of the 
finite length of the time series used to compute the spectra. Again,
in all cases the spectral content at the very low frequency end of the
spectrum is much higher than that predicted by theory, owing to
the nonstationary character of mobile-radio fields (see Section VI).

Fig. 5. Input and baseband output spectra for a vertical monopole antenna
in an isotropically scattered field.

But in some cases, such as the spectrum obtained by Rustako,® there 
is reasonably good agreement between the general shape of the spec- 
trum observed and that shown in Fig. 5b. Section IV shows that the 
theoretical spectra are different, except for the occurrence of the cutoff, 
if there is a significant directly transmitted component wave in addi- 
tion to the scattered component waves. Most of the observed spectra 
seem to be of this latter type. 

The above method of deriving spectra, by way of the Fourier 
transform of the autocovariance function, is not ideal. In all but the 
simplest cases (for example, when p(a) is uniform), direct integration 
of equation (7) is often impossibly difficult. As an alternative, the 
direct method (described in the next section), which depends on
associating a Doppler shift with the direction of arrival of each
component wave, is much simpler to apply and allows one to retain a
clear picture of the underlying physical processes.


IV. SIGNAL SPECTRUM AND ANGULAR PROBABILITY 


There is a simple direct relationship between the signal spectrum
at the mobile receiver’s antenna terminals and the product
$g(\alpha)p(\alpha)$. This is the product of the antenna’s azimuthal power
gain function $g(\alpha)$ and the probability density function $p(\alpha)$ of
the arrival angles of the plane waves which comprise the field incident
on the antenna. Let us look at the use of the relationship for an
omnidirectional antenna, the antenna assembly for the Pierce energy
density scheme, and an azimuthally directive antenna.


4.1 The General Relation 

The theoretical model proposed in Section 1.1 describes the field
incident on the mobile receiving antenna in terms of a random set of
vertically polarized plane waves incident horizontally which occur
with probability density $p(\alpha)$, where $\alpha$ is the azimuth angle. Then,
because of the vehicle’s movement, each angle $\alpha$ (see Fig. 6) will be
associated with a Doppler shift f in frequency from the carrier
frequency, such that

$$
f = f_m \cos\alpha
$$

where

$$
f_m = \frac{V}{\lambda} \tag{21}
$$

is the maximum Doppler shift at the vehicle speed V and carrier
wavelength $\lambda$.

Fig. 6. Relative directions of the mobile vehicle and a typical component
plane wave.
The spectrum of the signal v at the terminals of the receiving antenna
on the mobile vehicle will consist of a set of spectral lines which will
occur at random in the range $\pm f_m$ about the carrier frequency $f_c$. The
probability that one of these spectral lines will occur in the range from
f to f + df is given by the probability density function $p_f(f)$, which may
be obtained (see p. 33 of Ref. 3) from the probability density function
$p(\alpha)$ by equating the differential probabilities

$$
p_f(f) \, |df| = \{ p(+\alpha) + p(-\alpha) \} \, |d\alpha| \tag{24}
$$

since $+\alpha$ and $-\alpha$ give the same Doppler shift. Then, since
$f = f_m \cos\alpha$ gives $|df| = f_m \sqrt{1 - (f/f_m)^2} \, |d\alpha|$,

$$
p_f(f) = \frac{1}{f_m \sqrt{1 - (f/f_m)^2}}
\left\{ \big[ p(\alpha) \big]_{\alpha = \cos^{-1}(f/f_m)} + \big[ p(\alpha) \big]_{\alpha = -\cos^{-1}(f/f_m)} \right\}. \tag{25}
$$


The signal spectrum $S_v(f)$, the average energy of the signal v in the
frequency range f to f + df, is given by $p_f(f)$ weighted by the power
gain $g(\alpha)$ of the antenna in the corresponding azimuthal direction
$\alpha$. Thus

$$
S_v(f) = \frac{1}{f_m \sqrt{1 - (f/f_m)^2}}
\left\{ \big[ p(\alpha) g(\alpha) \big]_{\alpha = \cos^{-1}(f/f_m)} + \big[ p(\alpha) g(\alpha) \big]_{\alpha = -\cos^{-1}(f/f_m)} \right\} \tag{26}
$$

which is the desired general relation. (See Appendix C for a formal
proof.)
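The general relation of equation (26) translates almost line for line into code. The following is a minimal sketch, assuming that p(α) and g(α) are supplied as ordinary functions of the azimuth angle in radians; the function names and the use of the substitution f = f_m cos α for the area check are choices made here, not the paper’s.

```python
import math
from typing import Callable

AngleFunction = Callable[[float], float]


def doppler_spectrum(f: float, f_m: float, p: AngleFunction, g: AngleFunction) -> float:
    """Signal spectrum at Doppler shift f from the general relation (26)."""
    ratio = f / f_m
    if abs(ratio) >= 1.0:
        return 0.0  # no spectral lines outside the range -f_m .. +f_m
    alpha = math.acos(ratio)
    weight = p(alpha) * g(alpha) + p(-alpha) * g(-alpha)
    return weight / (f_m * math.sqrt(1.0 - ratio * ratio))


def uniform_pdf(alpha: float) -> float:
    """Arrival angles equally likely over -pi .. +pi."""
    return 1.0 / (2.0 * math.pi)


def omni_gain(alpha: float) -> float:
    """Constant azimuthal power gain, as for a vertical monopole."""
    return 1.0


def total_power(f_m: float, p: AngleFunction, g: AngleFunction, steps: int = 2000) -> float:
    """Integrate the spectrum over f using f = f_m*cos(alpha), 0 < alpha < pi.

    The substitution avoids sampling the integrable singularities at f = +/- f_m.
    """
    d_alpha = math.pi / steps
    total = 0.0
    for i in range(steps):
        alpha = (i + 0.5) * d_alpha
        f = f_m * math.cos(alpha)
        total += doppler_spectrum(f, f_m, p, g) * f_m * math.sin(alpha) * d_alpha
    return total


if __name__ == "__main__":
    f_m = 10.0  # hypothetical maximum Doppler shift in hertz
    print("S(0) for uniform p, omnidirectional g:", doppler_spectrum(0.0, f_m, uniform_pdf, omni_gain))
    print("total power:", total_power(f_m, uniform_pdf, omni_gain))
```

Any other angular density or antenna pattern can be passed in place of `uniform_pdf` and `omni_gain`, which is the practical advantage of the direct method over transforming an autocovariance function.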


4.2 Application of the General Relation 


4.2.1 Omnidirectional Antennas 


The practical case of most frequent interest is that of a vertical
monopole antenna, which has a constant azimuthal gain function,
say $g(\alpha) = 1$. Assuming that $p(\alpha)$ is uniform for all angles
throughout the range $-\pi$ to $+\pi$, $p(\alpha) = (2\pi)^{-1}$ and the signal
spectrum at the antenna terminals would be

$$
S_v(f) = \frac{1}{\pi f_m \sqrt{1 - (f/f_m)^2}} \tag{27}
$$

for frequency shifts in the range $\pm f_m$ about the carrier frequency $f_c$,
and would be zero outside that range. The spectrum of equation (27)
is identical to that of equation (20), which is for the electric field under
the same circumstances, an identity that was to be expected.
The spectral shape of equation (27) is therefore that of Fig. 5a. The
corresponding receiver baseband output spectrum, assuming square-law
detection, would be that of Fig. 5b.

The baseband output spectrum is considerably different if, in addition
to the uniformly scattered set of waves, there is a significant
wave transmitted directly from the transmitter to the receiver. If
the angle of arrival of the direct wave is $\alpha_0$, the spectrum of the
signal at the terminals of an omnidirectional antenna would be that shown
in Fig. 7a. This is the basic scattered spectrum of equation (27)
together with a spectral line displaced from the carrier frequency
by $f_m \cos\alpha_0$.

The corresponding output spectrum from a receiver with a square-law
detector (or to good approximation if the detector is half-wave
linear) may be obtained by convolving the above input spectrum
with itself. (See p. 255 of Ref. 3.) This yields a baseband output
spectrum of the form in Fig. 7b, in which $\alpha_0$ was chosen to be 60
degrees. In general, the high-frequency part of the baseband spectrum
ends in a shelf which cuts off at twice the maximum Doppler frequency
shift. (In the case of the half-wave linear detector there is a
small amount of energy at frequencies beyond the cutoff frequency.)

Fig. 7. Input and baseband output spectra for signals from an
omnidirectional antenna, when a uniformly scattered field plus a direct
wave are incident.

There are two peaks in the baseband output spectrum which occur at
$f = f_m(1 \pm \cos\alpha_0)$. Such peaks, as well as the final shelf, are clearly in
evidence in Ossanna’s experimental spectra.’ Figure 8 shows two more 
experimental spectra, one where the direction to the transmitter was at 
right angles to the path of the receiving vehicle, and the other where 
the transmitter was directly ahead. The dashed curves are theoretical 
spectra with the ratio of power in the direct wave to the total scattered 
power adjusted arbitrarily. The theory apparently gives the basic form 
of the experimental spectra, but there are differences in detail. 
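To give a feel for the frequencies involved, the sketch below evaluates the maximum Doppler shift f_m = V/λ and the positions of the two baseband peaks and the cutoff for a given direct-wave angle. The inputs are illustrative assumptions only: the 22 ft/s vehicle speed quoted earlier for the Fig. 2 data is combined with an assumed carrier of 836 MHz, and the function names are hypothetical.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
METRES_PER_FOOT = 0.3048


def max_doppler_shift(speed_m_per_s: float, carrier_hz: float) -> float:
    """Maximum Doppler shift f_m = V / lambda, with lambda = c / carrier frequency."""
    wavelength = SPEED_OF_LIGHT_M_PER_S / carrier_hz
    return speed_m_per_s / wavelength


def baseband_features(f_m: float, direct_angle_deg: float) -> tuple:
    """Return (lower peak, upper peak, cutoff) of the square-law baseband spectrum.

    The peaks lie at f_m * (1 -/+ cos(alpha_0)) and the shelf ends at 2 * f_m.
    """
    c = math.cos(math.radians(direct_angle_deg))
    peaks = sorted((f_m * (1.0 - c), f_m * (1.0 + c)))
    return peaks[0], peaks[1], 2.0 * f_m


if __name__ == "__main__":
    f_m = max_doppler_shift(22.0 * METRES_PER_FOOT, 836e6)  # assumed inputs
    lower, upper, cutoff = baseband_features(f_m, 60.0)
    print(f"f_m = {f_m:.1f} Hz, peaks at {lower:.1f} and {upper:.1f} Hz, cutoff at {cutoff:.1f} Hz")
```

For a direct wave at 60 degrees the peaks fall at one half and three halves of f_m; for a transmitter at right angles to the path they merge at f_m, and for a transmitter directly ahead they move to zero frequency and to the cutoff.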

Of course, complete agreement of theory and experiment is not 
to be expected. Apart from obvious changes, such as the speed of 
the vehicle and its inclination to the transmitter direction, the p(a) 
for the scattered waves and the magnitude of the direct wave will 
change throughout the entire data run. This means that the time 
series constituted by the output voltage of the receiver is not a sta- 
tionary process, whereas the spectra are deduced on the assumption 
that it is. Methods of approaching this problem of the nonstationarity 
of mobile-radio data are discussed in Section VI, and methods of mak- 
ing a more valid comparison of theory and experiment are suggested. 


4.2.2 Vertical Loop Antennas 


As a simple example of an azimuthally directional antenna, the 
vertical loop is interesting because it forms part of the Pierce “total 
field” antenna system. (See Ref. 2, pp. 14 and 15, where this arrange- 
ment of a vertical monopole, together with two orthogonal vertical 
loops, is discussed in terms of the vertical component of the electric
field and the two horizontal components of the magnetic field.)

Assume that the plane of loop 1 (see Fig. 9) lies in the direction
of travel and that the plane of loop 2 lies perpendicular to that
direction. Then the azimuthal power gain functions for the two
orthogonal loops will be of the form

$$
g_1(\alpha) = \cos^2\alpha \tag{28}
$$

and

$$
g_2(\alpha) = \sin^2\alpha, \tag{29}
$$

respectively.

Fig. 8. Comparison of theoretical (broken line) and experimental baseband
output spectra with transmitter (a) at right angles to, and (b) directly ahead
of, the vehicle path. (Vertical axis: decibels. Horizontal axis: baseband
frequency in hertz, 2 to 100 Hz.)

If it is further assumed that the scattered waves are uniformly
distributed in angle, that is, $p(\alpha) = (2\pi)^{-1}$, and that there is no
significant direct wave, then, using the general relation of equation
(26), the spectra of the signals at the terminals of the two loop
antennas will be

$$
S_1(f) = \frac{(f/f_m)^2}{\pi f_m \sqrt{1 - (f/f_m)^2}} \tag{30}
$$

and

$$
S_2(f) = \frac{1 - (f/f_m)^2}{\pi f_m \sqrt{1 - (f/f_m)^2}} = \frac{\sqrt{1 - (f/f_m)^2}}{\pi f_m}. \tag{31}
$$

Figure 10 shows these spectra with their corresponding baseband
output spectra, assuming square-law detection in the receiver.

Fig. 9. Plan view of Pierce antenna system, consisting of a vertical monopole
and two orthogonal vertical loops.

The spectra of equations (30) and (31) could also have been ob- 
tained from the autocovariance functions of equations (10) and (11) 
by substituting equation (17) and taking their Fourier transforms. 
However, the general relation is much simpler to use and indeed is 
the only reasonable method to use in cases where p(a) and g(a) are 
other than of the simplest functional form. In addition, the general 
relation preserves the physical description of the problem. Thus the 
shapes of the spectra in Fig. 10a have a straightforward explanation 
in terms of the antenna patterns emphasizing the Doppler shifts 
resulting from waves arriving from some directions and deemphasizing 
others—which is precisely the meaning of the general relation of 
equation (26). 
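The substitution behind the two loop spectra can be written out in a few lines. This is a worked sketch under the assumptions stated in the text (uniform p(α), no direct wave, gains cos²α and sin²α), with f_m denoting the maximum Doppler shift; both gains and the uniform density are even in α, so the two terms of the general relation are equal.

```latex
% Worked substitution of f = f_m \cos\alpha into the general relation (26),
% with p(\alpha) = 1/(2\pi), g_1 = \cos^2\alpha and g_2 = \sin^2\alpha.
\begin{align*}
\cos^2\alpha &= (f/f_m)^2, \qquad \sin^2\alpha = 1 - (f/f_m)^2, \\
S_1(f) &= \frac{1}{f_m\sqrt{1-(f/f_m)^2}} \cdot 2 \cdot \frac{1}{2\pi}\,\cos^2\alpha
        = \frac{(f/f_m)^2}{\pi f_m \sqrt{1-(f/f_m)^2}}, \\
S_2(f) &= \frac{1}{f_m\sqrt{1-(f/f_m)^2}} \cdot 2 \cdot \frac{1}{2\pi}\,\sin^2\alpha
        = \frac{\sqrt{1-(f/f_m)^2}}{\pi f_m}, \\
S_1(f) + S_2(f) &= \frac{1}{\pi f_m \sqrt{1-(f/f_m)^2}}.
\end{align*}
```

The last line is the omnidirectional spectrum, as it should be since the two loop patterns sum to a constant gain. Loop 1 keeps the peaks at the extreme Doppler shifts and has a null at the carrier, while loop 2 removes the singularities at the band edges and peaks at the carrier.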


4.2.3 Beam Antennas 


The general relation of equation (26) gives a simple and direct 
solution for a beam antenna. The use of such highly directive antennas 
in mobile radio was suggested by W. C. Jakes’? with a view to reduc- 
ing the spectral width, and hence the rate of fading, of the received 
signal. The general relation shows immediately that such a reduction 
in spectral width does indeed occur, and gives the precise nature of 
that reduction. 



Consider the idealized beam antenna pattern shown in Fig. 11. The
power gain function $g(\alpha)$ in this case can be considered to be unity
over the beamwidth $\beta$ and zero in all other directions. If it is again
assumed that the scattered waves are uniformly distributed in angle
and that there is no significant direct wave, the effect of the antenna
pattern on the spectrum of the signal at the antenna terminals can
be thought of in terms of the pattern being a sectoral slice of a
fictitious omnidirectional pattern. Hence the spectrum for the beam
antenna is a slice taken from the spectrum for an omnidirectional
pattern. See equation (27) and Fig. 5a.

When the beam antenna is directed broadside to the direction of
vehicle travel, the spectrum of the signal at the antenna terminals
will be that shown in Fig. 11b, where the dashed curve shows the
“remainder” of the omnidirectional spectrum. The spectrum is almost
flat and is $2 f_m \sin(\beta/2)$ wide.

Fig. 10. Receiver input and baseband output spectra for the two orthogonal
loop antennas of Fig. 9.

Fig. 11. Receiver input spectra for an idealized beam antenna used in a
uniformly scattered field. (a) Beam antenna pattern. (b) Spectrum for antenna
directed broadside. (c) Spectrum for antenna directed straight ahead.

When the beam antenna is pointed straight ahead, along the direction
of vehicle travel, the spectrum is that shown in Fig. 11c. Instead
of being centered on the carrier frequency, as in the broadside case,
the spectrum occurs at the extreme right of the omnidirectional
spectrum, and is $f_m [1 - \cos(\beta/2)]$ wide.
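The two spectral widths follow from asking which Doppler shifts f_m cos α are reachable from arrival angles inside the beam. The sketch below checks the closed-form widths against a direct scan over the beam; the sectoral beam model is the idealized one of the text, while the function names, the sample count, and the numerical values are assumptions for illustration.

```python
import math


def doppler_extent(f_m: float, pointing: float, beamwidth: float, samples: int = 2001) -> tuple:
    """Smallest and largest Doppler shift f_m*cos(alpha) for alpha inside the beam.

    `pointing` is the beam axis relative to the direction of travel and
    `beamwidth` is the full width of the sectoral beam, both in radians.
    """
    shifts = [
        f_m * math.cos(pointing - beamwidth / 2.0 + beamwidth * i / (samples - 1))
        for i in range(samples)
    ]
    return min(shifts), max(shifts)


def broadside_width(f_m: float, beamwidth: float) -> float:
    """Spectral width for a beam at right angles to the direction of travel."""
    return 2.0 * f_m * math.sin(beamwidth / 2.0)


def ahead_width(f_m: float, beamwidth: float) -> float:
    """Spectral width for a beam pointed along the direction of travel."""
    return f_m * (1.0 - math.cos(beamwidth / 2.0))


if __name__ == "__main__":
    f_m, beta = 10.0, math.radians(60.0)  # hypothetical values
    lo, hi = doppler_extent(f_m, math.pi / 2.0, beta)
    print("broadside:", hi - lo, "closed form:", broadside_width(f_m, beta))
    lo, hi = doppler_extent(f_m, 0.0, beta)
    print("ahead:    ", hi - lo, "closed form:", ahead_width(f_m, beta))
```

For the same beamwidth the straight-ahead spectrum is much narrower than the broadside one, because the Doppler shift varies only to second order in angle about the direction of travel.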

Thus it is apparent that the use of highly directive antennas in 
mobile radio will lead to a reduction in spectral width. W. C.-Y. Lee 
has confirmed this experimentally, using an array antenna at 836 
MHz in a suburban environment.’* Lee derived from the measured
data the rate of crossing of the signal at a certain level and plotted 
this against antenna beamwidth. Rice has shown that for a narrow- 
band random signal which has a symmetrical spectrum about the 
carrier frequency, the rate of signal crossing at a certain level is 
just the probability density at that level multiplied by the square 
root of the second moment of the spectrum about the carrier fre- 
quency.‘ In this way the level crossing rate at a particular level is 
a measure of the width of the spectrum of the fading signal. 

The sectoral beam pattern assumed in the early part of this section
never occurs in practice. It is worth emphasizing this rather obvious
point in connection with calculating spectral second moments, because
even though the antenna sidelobe level might be uniformly low, there
will be spectral content throughout the entire range $f_c \pm f_m$. Also,
the basic omnidirectional spectrum emphasizes the contributions at
the extremes of this range. Hence calculations of the spectral second
moments might well be in error if they are based on the assumption
that the sidelobe level is zero.

V. CORRELATION BETWEEN SIGNALS OF DIFFERENT FREQUENCIES 


The problem of correlating two signals of slightly different fre- 
quencies occurs in mobile radio when questions of maximum usable 
bandwidth, or the use of a pilot signal at a frequency other than 
the carrier frequency, arise. Let us show that the covariance of two 
signals as a function of their frequency separation is simply the 
characteristic function of the probability density function of the 
time delays suffered by the component plane waves which are as- 
sumed to compose the mobile radio field. 


5.1 Theory 


Suppose that the transmitted signal contains two unmodulated
signals of frequencies $\omega_1$ and $\omega_2$, whose difference
$\Delta\omega = \omega_2 - \omega_1$ is small enough not to violate the
following assumptions. Assume that the two signals take exactly the same
time to travel from transmitter to mobile receiver along any one of the
scattering paths assumed in the model in Section I. This assumption
implies that propagation along all paths is by way of free-space-type
waves (which do not suffer dispersion), and that any phase changes
experienced at reflecting or diffracting objects are independent of
frequency. Associate a time of travel $t_n$ with the nth component wave,
and define a time delay $\Delta t_n$ in comparison with the shortest
possible time of travel $t_0$ such that

$$
\Delta t_n = t_n - t_0. \tag{32}
$$


To preserve the assumption made in all previous sections that the
phases of the component waves are random and equally probable
throughout 0 to $2\pi$, it is necessary that the average magnitude of the
time delay difference between the nth and mth waves, assumed to be
independent, be

$$
\langle |t_n - t_m| \rangle_{av} \gg 1/f_c, \tag{33}
$$

where $f_c$ is a frequency in the neighborhood of $f_1$ and $f_2$.
The electric fields at the two frequencies may be written as

$$
E_1 = E_{01} \sum_{n=1}^{N} \exp\{j\omega_1(t - t_n)\}
$$

$$
E_2 = E_{02} \sum_{n=1}^{N} \exp\{j\omega_2(t - t_n)\}
$$

where $E_{01}$ is the common amplitude at frequency $f_1$ of all the waves,
and similarly $E_{02}$ is the common amplitude at $f_2$. Forming the complex
product

$$
E_1^* E_2 = E_{01} E_{02} \exp\{j(\omega_2 - \omega_1)t\} \sum_{n=1}^{N} \sum_{m=1}^{N} \exp\{-j(\omega_2 t_m - \omega_1 t_n)\}
$$

and taking the expectation of both sides,

$$
\langle E_1^* E_2 \rangle_{av} = E_{01} E_{02} \exp\{j(\omega_2 - \omega_1)t\} \sum_{n=1}^{N} \langle \exp\{-j(\omega_2 - \omega_1)t_n\} \rangle_{av} \tag{34}
$$

since it has been assumed that the time delays are independent, and
therefore that

$$
\langle \exp\{-j(\omega_2 t_m - \omega_1 t_n)\} \rangle_{av} = 0 \qquad \text{for } m \ne n
$$

as a consequence of inequality (33). The covariance of the two fields
as a function of their frequency separation $\Delta\omega$ is therefore

$$
R_{12}(\Delta\omega) = \langle E_1^* E_2 \rangle_{av}
 = N E_{01} E_{02} \exp\{j\Delta\omega\, t\} \exp\{-j\Delta\omega\, t_0\} \langle \exp\{-j\Delta\omega\, \Delta t\} \rangle_{av},
$$

where the subscript n has been dropped on $\Delta t_n$ because the average is
the same for any n. The normalized magnitude of $R_{12}(\Delta\omega)$,

$$
|\rho_{12}(\Delta\omega)| = \left| \langle \exp\{-j\Delta\omega\, \Delta t\} \rangle_{av} \right|, \tag{35}
$$

is simply the magnitude of the characteristic function, with negative
argument, of the probability density function for the time delays
$\Delta t$. (See Ref. 3, p. 50.)

As an example, suppose that the time delays are exponentially
distributed, so that the probability density function of $\Delta t$ is

$$
p(\Delta t) = \frac{1}{T} \exp\left( -\frac{\Delta t}{T} \right) \qquad \text{for } 0 \le \Delta t \le +\infty \tag{36}
$$

where T is a measure of the spread of the time delays. Then the
normalized magnitude of the covariance function in equation (35)
becomes

$$
|\rho_{12}(\Delta\omega)| = [1 + (\Delta\omega T)^2]^{-1/2}, \tag{37}
$$

which is shown in Fig. 12. It is apparent that the correlation falls off
significantly for frequency separations $\Delta\omega > 1/T$, the inverse of
the measure of the spread in time delays.
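The step from the exponential delay density to the closed-form correlation is a single elementary integral. The worked lines below are a sketch of that step, using Δω for the frequency separation and T for the delay spread as in the text.

```latex
% Characteristic function (with negative argument) of the exponential
% delay density p(\Delta t) = (1/T) \exp(-\Delta t / T), \Delta t \ge 0.
\begin{align*}
\langle e^{-j\Delta\omega\,\Delta t} \rangle_{av}
  &= \int_0^{\infty} \frac{1}{T}\, e^{-\Delta t/T}\, e^{-j\Delta\omega\,\Delta t}\, d(\Delta t)
   = \frac{1}{T} \cdot \frac{1}{1/T + j\Delta\omega}
   = \frac{1}{1 + j\Delta\omega T}, \\
|\rho_{12}(\Delta\omega)|
  &= \frac{1}{|1 + j\Delta\omega T|}
   = \left[ 1 + (\Delta\omega T)^2 \right]^{-1/2}.
\end{align*}
```

At a separation of Δω = 1/T the magnitude has fallen to 1/√2, about 0.707, which is one way of making the statement about the fall-off quantitative.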


5.2 Experiment


Aside from its mathematical convenience, the exponential distribu- 
tion of time delays seems physically plausible on the grounds that 
the shorter delays appear more likely to occur than the longer delays. 
Indeed, the pulse observations made by Young and Lacy at a fre- 
quency of 450 MHz in New York City support this contention.*® 

Ossanna has computed the envelope correlations from measure- 
ments at 860 MHz in a suburban environment for two-carrier fre- 
quency separations of 0.1, 0.5, 1.0, and 2.0 MHz.'® The corresponding 
covariances are shown as circles in Fig. 12, where it has been
assumed that T = 1/4 μsec. A comparison of these experimental points
with the theoretical curve indicates that an exponential distribution
of time delays is a reasonably good assumption, and that in the
suburban environment where the experiments were performed the
time-delay spread T is about 1/4 μsec.
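The theoretical curve can be evaluated at the four frequency separations used in the measurements. The sketch below assumes the exponential-delay result, a magnitude of [1 + (Δω T)²]^(-1/2) with Δω = 2πΔf, and a delay spread of 1/4 μsec; it reproduces only the theoretical values, not the experimental points, and the function name is hypothetical.

```python
import math


def frequency_correlation(delta_f_hz: float, delay_spread_s: float) -> float:
    """Normalized covariance magnitude for exponentially distributed delays.

    delta_f_hz is the separation of the two carriers in hertz and
    delay_spread_s is the delay spread T in seconds.
    """
    delta_omega = 2.0 * math.pi * delta_f_hz
    return 1.0 / math.sqrt(1.0 + (delta_omega * delay_spread_s) ** 2)


if __name__ == "__main__":
    delay_spread = 0.25e-6  # assumed T = 1/4 microsecond
    for separation_mhz in (0.1, 0.5, 1.0, 2.0):
        rho = frequency_correlation(separation_mhz * 1e6, delay_spread)
        print(f"{separation_mhz:3.1f} MHz: {rho:.3f}")
```

With this delay spread the correlation stays close to unity at 0.1 MHz and drops to roughly 0.3 at 2 MHz, so the usable coherent bandwidth is of the order of a megahertz.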

In contrast, Young and Lacy’s pulse measurements indicate a
time-delay spread of about 5 μsec, but with an approximately
exponential distribution. The difference in time-delay spreads
appears to result from the different environments in which the
experiments were performed, not from the different frequencies, because
the frequency difference is not great. Thus in a suburban environment the
component waves are likely to have been redirected by objects within
a few hundred feet of the mobile receiver, whereas in New York City
the range of these objects can reasonably be put at many thousands
of feet.

Fig. 12. Normalized covariance of two signals as a function of their frequency
separation, assuming an exponential distribution of time delays with delay
spread T. The circles are Ossanna’s experimental points.


5.3 Significance of the Random Time Delays 


The immediate benefit of knowing the probability distribution of 
the time delays of the component waves is that it enables one to 
deduce the “coherent bandwidth” for that particular system. But 
the significance of the time delays is much more than this, in that 
it emerges as a basic characteristic of the system along with the 
probability distribution of the angles of arrival of the component 
plane waves. 

Indeed, it would appear that a knowledge of the joint distribution
$p(\alpha, \Delta t)$ of the angles of arrival $\alpha$ and the delay times
$\Delta t$ provides an almost complete description of the mobile radio
field; hence, of the mobile radio signals sensed by antennas moving
through this field.

Thus, integration of the joint distribution with respect to $\alpha$ yields
the distribution of time delays. Then if the standard deviation of
the time delays is large compared with a period of the carrier
frequency, the component waves may be said to be completely randomly
phased and their phases and angles of arrival to be independent.
The results obtained in Sections II, III, and IV would then
follow, because they are based solely on the knowledge of $p(\alpha)$ and the
assumptions that the phase is completely random and independent
of the angle of arrival.

An interesting sidelight is that the cross-covariance of two signals
of different frequencies, one shifted in time by $\tau$ from the other,
depends on the joint distribution $p(\alpha, \Delta t)$. The Fourier transform
of this cross-covariance yields the cross-spectrum of the two
frequency-separated signals.

It is tempting to assume that $\alpha$ and $\Delta t$ are independent, thus
making the calculation much simpler. But this does not yield answers
that accord with experiment; so one must conclude that $\alpha$ and $\Delta t$
are not independent. This also seems a reasonable conclusion on
physical grounds, since it is likely that the shortest time delays will 
be associated with angles of arrival from the general area of the 
transmitter, and that the longest delays will be associated with the 
opposite direction. 


VI. THE NONSTATIONARY CHARACTER OF MOBILE RADIO SIGNALS 


A perennial difficulty in the analysis of mobile-radio data is its 
nonstationary character. This makes the analysis arbitrary and
its interpretation uncertain. This section attempts to meet this
difficulty directly, rather than trying to find sections of data that
“look” stationary or attempting to “doctor” data to that same end before
it is analyzed.

The data chosen for analysis were those obtained by Rustako on 
a single omnidirectional antenna at 836 MHz along Sherwood Drive, 
a suburban street approximately 2 miles from the transmitter and 
running at an angle of about 48° to the transmitter direction.*? The 
choice of data was made on the grounds that Rustako’s computed 
output spectra most closely resembled the shape of the theoretical 
output spectrum of Fig. 5b which is for a completely scattered field 
with no significant directly transmitted component. 

Two tests were performed on the data, one to determine the proba- 
bility distribution of the envelope and the other to determine its 
time correlation by using Kolmogorov’s structure function. 


6.1 The Probability Distribution 


6.1.1 Theory for a Stationary Process 


According to the theory of Section 2.1, if the field incident on the 
mobile receiver is of the scattered type, each component wave being 
independent and randomly phased, then the probability density func- 
tion (p.d.f.) of the envelope R is Rayleigh, that is, 


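$$p(R) = \frac{2R}{\sigma^2} \exp\left\{-\frac{R^2}{\sigma^2}\right\} \quad \text{for } 0 \le R \le \infty \tag{38}$$

which has the corresponding cumulative distribution function

$$P(R) = \int_0^R p(R)\, dR = 1 - \exp\left\{-\frac{R^2}{\sigma^2}\right\}. \tag{39}$$

This distribution has a root-mean-square value

$$\sqrt{\langle R^2 \rangle_{av}} = \sigma \tag{40}$$

a mean value

$$\langle R \rangle_{av} = \frac{\sqrt{\pi}}{2}\,\sigma = 0.886\,\sigma \tag{41}$$

and a most probable value (or “mode”)

$$R = \frac{\sigma}{\sqrt{2}} = 0.707\,\sigma. \tag{42}$$
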
A convenient method of testing whether or not a given set of statistical
data follow an assumed distribution is as follows.’’ First the histogram
of the data (that is, relative frequency diagram), which is the practical
approximation to the probability density function, is obtained. This
is then summed point by point to give the cumulative frequency diagram,
which is the practical approximation or estimate P̂(R) of the
cumulative distribution function P(R). Then P̂(R) is plotted against
P(R). If the two are identical for all R, then the resulting plot will be
a straight line from (0, 0) to (1, 1). If not, the departure of the plot
from the straight line is a measure of the departure of P̂(R) from P(R).

In analyzing Rustako’s data the question to be answered was how
closely the data followed a Rayleigh distribution. The appropriate
P(R) is then that of equation (39); and the value of σ can be
obtained from the maximum of the histogram with the aid of equation
(42). The above arguments assume that the data form a stationary
process.


6.1.2 Theory for a Nonstationary Process 


If the theory of Section 2.1 is modified slightly, by normalizing to
the local mean, to take account of the undoubted fact that either the
number or the magnitude of the component waves will vary as the
vehicle moves along its path, and if the assumption that the
field is completely scattered is retained, then the expected distribution
of the envelope will again be Rayleigh. However, the root-mean-square
value σ will no longer be a constant, but will vary with time
in some manner σ(t). The envelope can now be classed as a
nonstationary Rayleigh process.

It is possible to estimate σ(t) from the record by computing the
“local” mean ⟨R⟩av(t); then from equation (41)

$$\langle R \rangle_{av}(t) = 0.886\,\sigma(t). \tag{43}$$

Hence one may write the new random variable

$$r = \frac{R}{\sigma(t)} = \frac{0.886\,R}{\langle R \rangle_{av}(t)} \tag{44}$$

which in effect has a root-mean-square value of unity. The r process
will be a stationary Rayleigh process with a p.d.f.

$$p(r) = 2r \exp\{-r^2\}.$$

Equations (43) and (44), in effect, remove the nonstationary effects
from the statistics. The meaning of “local” is explained further in the
next section.


6.1.3 Analysis 


Rustako’s data, which had been converted to digital form at 500 
samples per second, was taken in sets of 4000 points at a time. Notice 
that such a length of data contains approximately 200 fading cycles. 

Each set was analyzed, first of all, on the assumption that it was
stationary, by the method outlined in Section 6.1.1. To obtain the
histogram, the amplitude range between the lowest and the highest
value was divided into 50 equal slices. The P̂(R) versus P(R) plots
for three sets of data are shown on the left side of Fig. 13, each point
corresponding to a particular slice level. The three sets of data were
chosen to illustrate where P̂(R) is always greater than P(R), where
P̂(R) is always less than P(R), and where they are approximately equal.
On the assumption that all three sets of data are stationary it would
have to be said that the first two cases are non-Rayleigh
while the third case is Rayleigh.

Next, the same sets of data were normalized by the method outlined 
in Section 6.1.2. The local mean for every point was obtained by averag- 
ing the 200 points symmetrically adjacent to that point. The resulting 
normalized random variable was then treated in exactly the same way 
as the unnormalized random variable. The right side of Fig. 13 shows
plots of P̂(r) versus P(r). It can be seen that in the first two cases the
normalized random variable is much more closely Rayleigh distributed 
than is the unnormalized random variable. The third case is interesting 
because, although the normalization was not necessary to reduce the 
data to a stationary Rayleigh process, it demonstrates that the tech- 
nique of normalization itself does not significantly impair the original 
process. 
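
The normalization procedure can be illustrated with a short numerical sketch. The Python below is not the original analysis program. It is a minimal illustration under stated assumptions: a synthetic record of independent Rayleigh samples whose σ(t) varies sinusoidally stands in for the measured envelope, and the helper names (`running_mean`, `normalize_envelope`, `max_departure`, `synthetic_record`) are hypothetical. It forms the local mean over about 200 neighbouring points, applies equations (43) and (44), and measures the largest departure of the cumulative frequency diagram from the Rayleigh form at 50 slice levels.

```python
import bisect
import math
import random


def running_mean(x, window=200):
    """Centered moving average; the window is truncated at the record ends."""
    half = window // 2
    prefix = [0.0]
    for value in x:
        prefix.append(prefix[-1] + value)
    means = []
    for i in range(len(x)):
        lo = max(0, i - half)
        hi = min(len(x), i + half + 1)
        means.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return means


def normalize_envelope(R, window=200):
    """r = 0.886 R / <R>av(t), following equations (43) and (44)."""
    return [0.886 * Ri / m for Ri, m in zip(R, running_mean(R, window))]


def max_departure(r, slices=50):
    """Largest |P_hat(r) - P(r)| over equally spaced slice levels,
    with P(r) = 1 - exp(-r^2), the unit-rms Rayleigh distribution."""
    ordered = sorted(r)
    lo, hi = ordered[0], ordered[-1]
    worst = 0.0
    for k in range(1, slices + 1):
        level = lo + (hi - lo) * k / slices
        p_hat = bisect.bisect_right(ordered, level) / len(ordered)
        worst = max(worst, abs(p_hat - (1.0 - math.exp(-level * level))))
    return worst


def synthetic_record(n=4000, seed=0):
    """Assumed test signal: Rayleigh samples with slowly varying sigma(t)."""
    rng = random.Random(seed)
    record = []
    for i in range(n):
        sigma = 1.0 + 0.8 * math.sin(2.0 * math.pi * i / n)
        record.append(sigma * math.sqrt(-math.log(1.0 - rng.random())))
    return record


R = synthetic_record()
global_mean = sum(R) / len(R)
raw = [0.886 * Ri / global_mean for Ri in R]   # record treated as stationary
normalized = normalize_envelope(R)             # local mean removed
print(max_departure(raw), max_departure(normalized))
```

In this sketch the raw record departs visibly from the straight line of the P̂ versus P plot, while the normalized record stays close to it, which is the qualitative behavior described for the first two cases of Fig. 13.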

In conclusion, it can be said that the technique of normalizing 
a nonstationary Rayleigh process by way of its running mean can 
be used to determine whether or not the process is in fact Rayleigh. 
But it must be emphasized that the technique cannot be applied to 
processes that are non-Rayleigh. It is certainly possible, however, 
that different techniques along these same lines might apply to dif- 
ferent processes, although it would appear that some knowledge of 
the expected distribution is essential. The Rayleigh process is one of 
the simplest to handle because it is determined by a single parameter. 
In the example used here the Rayleigh process was clearly indicated
by the theory, and the analysis amounts to a positive confirmation
of its applicability.

Fig. 13 — Plots of P̂(R) versus P(R). (a) For the raw data. (b) For the same
data normalized by its running mean.


6.2 Using Kolmogorov’s Structure Function 


Tartarski® has described the value of using a “structure function” 
in specifying random variables which are not statistically stationary. 
(The technique was first used by Kolmogorov to describe meteorological
quantities.) The structure function might be of value in
analyzing nonstationary mobile radio data. 


6.2.1 Definition and Properties 

The simplest type of structure function, D_f(τ), of the real random
variable f(t), is defined by

$$D_f(\tau) = \langle [f(t + \tau) - f(t)]^2 \rangle_{av} \tag{45}$$

where the angular parentheses denote a time average. This should be
compared with the more commonly used autocovariance function,
defined for a stationary random variable whose mean is zero by

$$R_f(\tau) = \langle f(t + \tau)\, f(t) \rangle_{av}. \tag{46}$$

Thus the structure function for a stationary random variable, which
can be written in terms of the autocovariance function, is

$$D_f(\tau) = 2[R_f(0) - R_f(\tau)]. \tag{47}$$

As an example, the structure function for a stationary random variable
with a Gaussian autocovariance function, exp{−τ²/τ₀²} in which τ₀ is
constant, is depicted by the solid line in Fig. 14. The equation of this
solid line is

$$D_f(\tau) = 2[1 - \exp\{-\tau^2/\tau_0^2\}].$$


Fig. 14—Structure functions for stationary (solid line) and nonstationary
(dashed line) random variables, plotted against time τ.

Now, if the random variable is nonstationary in that it has, say,
a slowly varying mean value, then the structure function would be
modified in some way such as that shown dashed in Fig. 14. This
dashed portion would very likely be indeterminate, so that the
corresponding autocovariance function would be indeterminate for all τ.
Hence the value of working, at least initially, with the structure
function: if the random variable is stationary, that will immediately
be apparent in that D_f(τ) will approach a horizontal asymptote for
large τ; and if it is nonstationary, the portion for small τ can be
relied on.

The dashed portion of Fig. 14 can be shown to correspond to an 
increase in low-frequency spectral energy compared with the station- 
ary case.'® 
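
The definitions above translate directly into a few lines of code. The following Python is a minimal sketch, not the procedure used for the measured data; the test signal (a sinusoid with a 50-sample period, with and without a slow linear drift) and the function names are assumptions chosen for illustration. It computes the structure function of equation (45) and the autocovariance of equation (46) as time averages over a sampled record, checks equation (47) on the stationary record, and shows how a drifting mean raises the structure function at large lags.

```python
import math


def structure_function(f, max_lag):
    """D_f(k) = time average of [f(t + k) - f(t)]^2, equation (45)."""
    D = []
    for k in range(max_lag + 1):
        terms = [(f[i + k] - f[i]) ** 2 for i in range(len(f) - k)]
        D.append(sum(terms) / len(terms))
    return D


def autocovariance(f, max_lag):
    """R_f(k) = time average of f(t + k) f(t), equation (46); zero mean assumed."""
    R = []
    for k in range(max_lag + 1):
        terms = [f[i + k] * f[i] for i in range(len(f) - k)]
        R.append(sum(terms) / len(terms))
    return R


# Assumed test signal: a sinusoid with a period of 50 samples.
stationary = [math.sin(2.0 * math.pi * i / 50.0) for i in range(5000)]
# The same signal with a slow linear drift in its mean (nonstationary).
drifting = [s + 0.001 * i for i, s in enumerate(stationary)]

D_stat = structure_function(stationary, 50)
R_stat = autocovariance(stationary, 50)
D_drift = structure_function(drifting, 50)

# Equation (47) holds closely for the stationary record ...
worst = max(abs(D_stat[k] - 2.0 * (R_stat[0] - R_stat[k])) for k in range(51))
# ... while at lag k the drift adds (0.001 k)^2 plus a cross term.
print(worst, D_stat[50], D_drift[50])
```

At a lag of one full period the stationary structure function returns to zero, whereas the drifting record retains the contribution of the drift alone, which is the kind of upward trend sketched by the dashed curve of Fig. 14.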


6.2.2 A Structure Function Computed from the Data 


The solid line in Fig. 15 shows the structure function for Rustako’s
Sherwood Drive data, computed from the definition of equation (45).
The data, again consisting of 4000 points, roughly straddled that
which gave the first two probability plots of Fig. 13. The structure
function is shown out to a time separation τ of 50 data points, or
100 msec.

Fig. 15—Structure function computed for Rustako’s data (solid line). The
dashed line is the theoretical structure function for a stationary random variable.

The dashed curve is a theoretical structure function for an assumed
stationary process with an autocovariance function of the form
J₀²(2πf_m τ), where f_m is the maximum Doppler shift. This autocovariance
function, which is derived from equations (16) and (17), is for the
departure of the signal envelope from its mean value for the case of
an omnidirectional antenna in a uniformly scattered field. The theoretical
and experimental structure functions were arbitrarily made equal
at the first maximum.

The experimental structure function, which is typical of many that
were obtained, exhibits some of the features that were expected. The
initial part of the curve, for small τ, closely follows the theoretical
curve, and the quasiperiodic nature of the curve for large τ is also
evident. In this region the experimental curve rises systematically
above the theoretical curve, as was to be expected for nonstationary
data.

This upward trend of the experimental structure function for large
τ corresponds to the repeated observation of baseband low-frequency
content at a significantly higher level than the theory predicts.

If this large-scale trend in the structure function were removed, 
then the modified structure function should agree with the theoretical 
structure function, provided that the basic assumptions of the theory 
are sound. The curves do differ, both in the amplitude and the period
of the quasi-periodic variation. However, this might well result from
the wrong choice of p(α), rather than from a basic flaw in the theory.

It is evident that the structure function does afford a method of 
analyzing nonstationary data. The effect of large-scale variations 
shows up in the structure function and can be removed at that point, 
rather than by tampering in an arbitrary manner with the original 
data. Then the modified structure function can be compared with 
theoretical forms which are appropriate to stationary data.


VII. CONCLUSIONS 


The theory presented in this paper attempts to explain the statistical
behavior of fields and signals encountered in mobile radio in
terms of a set of independent plane waves, redirected by scattering
and reflecting obstacles, and incident horizontally on the mobile
receiving vehicle. These waves can be described statistically by the
joint probability density function p(α, Δt) such that the probability
of a wave arriving at the azimuthal angle α with a time delay Δt is
p(α, Δt) dα d(Δt).

At ultrahigh frequencies and above, in urban and suburban
environments, the spread in the magnitudes of the time delays is
sufficiently large, compared with the radio-frequency period, for the
waves to be considered randomly phased, in which case the following
conclusions apply.

The field components are Gaussian, in the sense that their real 
and imaginary parts are independent zero-mean Gaussian random 
variables of equal variance. Thus the envelope of a signal derived 
from such a field by an antenna will be Rayleigh distributed, unless 
there is a significant nonscattered wave arriving directly from the 
transmitter, in which case the envelope will be Rice distributed. 

The spatial correlation of the field components may be derived
from the probability density function p(α). The spectrum of the
signal at the antenna terminals may be derived from the product of
p(α) with g(α), the azimuthal gain function of the antenna. The
coherence of two radio frequencies, as a function of their frequency
separation, may be derived from the probability density function of
the time delays p(Δt).

A brief examination of available experiments reveals that simple 
forms of both p(α) and p(Δt) give theoretical results which agree
broadly with experiment. We do not claim detailed agreement, nor 
does this seem possible until more complete experimental information 
is available. It does appear, however, that it is essential to take ac- 
count of the nonstationary character of the signals obtained in mobile 
radio when attempting such a comparison. 

The theoretical approach we have taken is midway between a 
purely phenomenological one, based on a complete catalog of the 
statistical characteristics of mobile-radio signals received under a 
variety of circumstances, and a purely analytical one in which the 
transmission environment is specified in detail. The phenomenological 
approach would be incomplete, in that it would not provide knowledge 
of why the signals have the character observed. The analytical ap- 
proach is impossibly difficult to execute. Our approach, which seeks 
to describe the mobile-radio fields in terms of the compact (though
not necessarily simple) quantity p(α, Δt), does provide the system
designer with information which he can use to advantage in a
straightforward way. The following example illustrates this claim.

Suppose that experiments in a particular environment have shown
that p(α) is roughly uniform and that p(Δt) is approximately
exponential with parameter T such that T is very
large compared with the period of the proposed carrier frequency of
the mobile radio system. Then it is known that if an antenna with
uniform gain in azimuth is used on the receiving vehicle the received
signal will be a Rayleigh distributed fluctuating quantity with a
baseband spectrum approximately uniform out to a frequency 2V/λ,
where V is the vehicle speed and λ is the carrier wavelength.

This system can be improved in a number of ways. The depth of 
fading, as Rustako has demonstrated,®? can be reduced by using a 
number of such antennas separated by a sufficient distance for the 
signals to be essentially uncorrelated. The signals are then brought 
to a common phase, at which point they are combined before detec- 
tion. The resulting signal is therefore the sum of a number of in- 
dependent, Rayleigh distributed amplitudes, which for a large num- 
ber will approach a Gaussian distribution with a nonzero mean. 

Furthermore, the ratio of the root-mean-square fluctuation to the
mean of the combined signal will decrease inversely as the square root
of the number of signals combined (by an approximate application of the
Central Limit Theorem). Alternatively, the rate of fading, as Lee has 
demonstrated,1* can be reduced by using directional antennas, which 
give a reduced spectral width of the fading* and hence a reduction 
in its rate. 
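
The square-root behavior of the combined signal can be checked numerically. The Python below is a hedged sketch rather than a model of any particular receiver: it assumes ideal co-phasing, equal branch gains, and statistically independent Rayleigh branches of equal rms value, and the function names are hypothetical. It estimates the ratio of root-mean-square fluctuation to mean for the sum of N envelopes and compares it with the single-branch value divided by the square root of N.

```python
import math
import random


def rayleigh_sample(rng, sigma=1.0):
    """One envelope sample with rms value sigma, as in equation (38)."""
    return sigma * math.sqrt(-math.log(1.0 - rng.random()))


def fluctuation_ratio(branches, trials=20000, seed=1):
    """Ratio of rms fluctuation to mean for the sum of `branches`
    independent Rayleigh amplitudes (co-phased, equal-gain addition)."""
    rng = random.Random(seed)
    sums = [sum(rayleigh_sample(rng) for _ in range(branches))
            for _ in range(trials)]
    mean = sum(sums) / trials
    variance = sum((s - mean) ** 2 for s in sums) / trials
    return math.sqrt(variance) / mean


# For one branch the ratio is sqrt(4/pi - 1), about 0.523; for
# independent branches it falls as 1/sqrt(N).
single = math.sqrt(4.0 / math.pi - 1.0)
for n in (1, 4, 16):
    print(n, fluctuation_ratio(n), single / math.sqrt(n))
```

Under these assumptions, sixteen branches reduce the relative fluctuation to about one quarter of the single-antenna value.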

W. C. Jakes has suggested a system, particularly suited for use at 
microwave frequencies, which combines the advantages of both a 
reduced depth and a reduced rate of fading.!° The system consists of 
a number of directive antennas mounted on a single mobile unit and 
pointing in different azimuthal directions. If the signals from the dif- 
ferent antennas are brought to a common phase and then combined 
before detection, the resulting signal will not only be considerably 
reduced in bandwidth compared with the case if an omnidirectional 
antenna had been used, but its depth of fading will also be reduced 
according to the square root of the number of antennas used. The 
widest coherent bandwidth that can be transmitted in the situation
assumed is about 1/T.
APPENDIX A


On the Correlation of the Real and Imaginary Parts
of the Field Components


It is important to know the precise conditions under which the six
real random variables comprising the real and imaginary parts of
the three field components of equations (1), (2), and (3) are
uncorrelated. Thus

$$E_z = E_0 \sum_{n=1}^{N} \cos\phi_n + jE_0 \sum_{n=1}^{N} \sin\phi_n$$

$$H_x = -\frac{E_0}{\eta} \sum_{n=1}^{N} \sin\alpha_n \cos\phi_n - j\frac{E_0}{\eta} \sum_{n=1}^{N} \sin\alpha_n \sin\phi_n$$

$$H_y = \frac{E_0}{\eta} \sum_{n=1}^{N} \cos\alpha_n \cos\phi_n + j\frac{E_0}{\eta} \sum_{n=1}^{N} \cos\alpha_n \sin\phi_n .$$

Denoting the real and imaginary parts of each field component by
the superscripts (r) and (i), the correlation coefficient of the real
and imaginary parts of the electric field is

$$\langle E_z^{(r)} E_z^{(i)} \rangle_{av} = E_0^2 \sum_{n=1}^{N} \sum_{m=1}^{N} \langle \cos\phi_n \sin\phi_m \rangle_{av} = 0$$

since the φ_n’s are independent and rectangularly distributed
throughout 0 to 2π.

Similarly,

$$\langle H_x^{(r)} H_x^{(i)} \rangle_{av} = \frac{E_0^2}{\eta^2} \sum_{n=1}^{N} \sum_{m=1}^{N} \langle \sin\alpha_n \sin\alpha_m \cos\phi_n \sin\phi_m \rangle_{av} = 0$$

and

$$\langle H_y^{(r)} H_y^{(i)} \rangle_{av} = \frac{E_0^2}{\eta^2} \sum_{n=1}^{N} \sum_{m=1}^{N} \langle \cos\alpha_n \cos\alpha_m \cos\phi_n \sin\phi_m \rangle_{av} = 0$$

with the additional assumption that the φ_n’s and α_n’s are statistically
independent. It can also be shown, based on the foregoing
assumptions, that the correlation coefficient for any component real part
and any component imaginary part is zero.

Notice that the above correlation coefficients are zero whatever
the probability density function p(α) of the α_n’s is. Where p(α) is
important is in the correlation coefficients for the component real
parts with each other and for the component imaginary parts with
each other. For example,

$$\langle H_x^{(r)} H_y^{(r)} \rangle_{av} = -\frac{E_0^2}{\eta^2} \sum_{n=1}^{N} \sum_{m=1}^{N} \langle \sin\alpha_n \cos\alpha_m \cos\phi_n \cos\phi_m \rangle_{av}$$

is zero if the further assumption is made that p(α) is rectangular
throughout −π to +π. Then the correlation coefficient is zero for any
pair of component real parts and for any pair of component imaginary
parts.


APPENDIX B 


Correlation of Fields—Their Magnitudes and Squared Magnitudes


Section 2.1 and Appendix A show that under certain conditions the
fields in mobile radio are “Gaussian fields,” which means that a
typical field component F (either an electric or magnetic component)
may be represented by

$$F = x + jy$$

where x and y are real, independent, zero-mean Gaussian random
variables of equal variance. Thus

$$\langle x \rangle_{av} = \langle y \rangle_{av} = 0$$

$$\langle x^2 \rangle_{av} = \langle y^2 \rangle_{av} = \sigma^2$$

and since both x and y are Gaussian distributed, their independence
is implied by

$$\langle xy \rangle_{av} = 0.$$

The theory in the main text is concerned with finding the covariance
⟨F₁*F₂⟩av of two such Gaussian fields, where F₁ and F₂ may be two field
components separated in space, in time, in frequency, or in all three.
Thus

$$F_1 = x_1 + jy_1$$

$$F_2 = x_2 + jy_2$$

and

$$R_F = \langle F_1^* F_2 \rangle_{av} = \langle x_1 x_2 \rangle_{av} + \langle y_1 y_2 \rangle_{av} + j(\langle x_1 y_2 \rangle_{av} - \langle x_2 y_1 \rangle_{av}).$$

If, as is most often the case, all real parts are uncorrelated with all
imaginary parts,

$$\langle x_1 y_2 \rangle_{av} = \langle x_2 y_1 \rangle_{av} = 0$$

and

$$R_F = \langle F_1^* F_2 \rangle_{av} = \langle F_1 F_2^* \rangle_{av} = \langle x_1 x_2 \rangle_{av} + \langle y_1 y_2 \rangle_{av} \tag{48}$$

is wholly real.

In practice it is not possible to measure the correlation of the
complex fields. But what can be measured is the correlation of their
magnitudes (that is, envelopes)

$$A = |F| = \sqrt{x^2 + y^2}$$

and the correlation of their squared magnitudes (that is, energies)

$$A^2 = |F|^2 = F^*F = x^2 + y^2.$$

The relation between the autocovariance functions R_F, R_A, and R_{A²}
is as follows.

Consider first the autocovariance function for squared magnitude

$$R_{A^2} = \langle |F_1|^2 |F_2|^2 \rangle_{av} = \langle F_1^* F_1 F_2^* F_2 \rangle_{av} = \langle x_1^2 x_2^2 \rangle_{av} + \langle y_1^2 y_2^2 \rangle_{av} + \langle x_1^2 y_2^2 \rangle_{av} + \langle x_2^2 y_1^2 \rangle_{av}.$$

To evaluate the right-hand side one may use the result that if x₁, ...,
x₄ are real, zero-mean Gaussian random variables (see Ref. 3, p. 168),

$$\langle x_1 x_2 x_3 x_4 \rangle_{av} = \langle x_1 x_2 \rangle_{av} \langle x_3 x_4 \rangle_{av} + \langle x_1 x_3 \rangle_{av} \langle x_2 x_4 \rangle_{av} + \langle x_1 x_4 \rangle_{av} \langle x_2 x_3 \rangle_{av}.$$

Then, typically,

$$\langle x_1^2 x_2^2 \rangle_{av} = \langle x_1 x_1 x_2 x_2 \rangle_{av} = \sigma^4 + 2(\langle x_1 x_2 \rangle_{av})^2$$

and

$$\langle x_1^2 y_2^2 \rangle_{av} = \langle x_1 x_1 y_2 y_2 \rangle_{av} = \sigma^4$$

so that

$$R_{A^2} = 4\sigma^4 + 2[(\langle x_1 x_2 \rangle_{av})^2 + (\langle y_1 y_2 \rangle_{av})^2]. \tag{49}$$

Now, in most cases

$$\langle x_1 x_2 \rangle_{av} = \langle y_1 y_2 \rangle_{av}. \tag{50}$$

For example (48) and (50) can be shown to follow if F₁ and F₂ are the
same field component, but do not hold if F₁ is E_z and F₂ is H_x.
Then equations (48) and (49) combined give

$$R_{A^2} = 4\sigma^4 + R_F^2 \tag{51}$$

or from equations (48) and (50)

$$R_{A^2} = 4\sigma^4 (1 + \rho^2), \tag{52}$$

where ρ is the normalized autocovariance function of the x and y
random processes.
The corresponding result for the autocovariance function of the
magnitudes (see p. 59 of Ref. 13) is

$$R_A = \langle A_1 A_2 \rangle_{av} = \langle |F_1|\,|F_2| \rangle_{av} = \sigma^2 [2E(\rho) - (1 - \rho^2) K(\rho)], \tag{53}$$

where K and E are the complete elliptic integrals of the first and
second kind. In series form

$$R_A = \frac{\pi}{2}\,\sigma^2 (1 + \rho^2/4 + \rho^4/64 + \cdots) \tag{54}$$

so that to a good approximation, neglecting powers of ρ higher than
the second,

$$R_A = \frac{\pi}{2}\,\sigma^2 (1 + \rho^2/4), \tag{55}$$

which has the same form as equation (52).
Finally, in terms of the field autocovariance function,

$$R_A = \frac{\pi}{2}\,\sigma^2 \left(1 + \frac{R_F^2}{16\sigma^4}\right). \tag{56}$$


Both autocovariance functions R_{A²} and R_A take on a much simpler
form when normalized in the following way. Define the normalized
autocovariance function of the departure δA² of the squared magnitude
A² from its mean as

$$\rho_{\delta A^2} = \frac{\langle (A_1^2 - \langle A_1^2 \rangle_{av})(A_2^2 - \langle A_2^2 \rangle_{av}) \rangle_{av}}{\sqrt{\langle (A_1^2 - \langle A_1^2 \rangle_{av})^2 \rangle_{av}\, \langle (A_2^2 - \langle A_2^2 \rangle_{av})^2 \rangle_{av}}}. \tag{57}$$

Then from equation (52)

$$\rho_{\delta A^2} = \rho^2. \tag{58}$$

Defining the normalized autocovariance function of the departure
δA of the magnitude A from its mean in a similar manner, equation
(54) gives

$$\rho_{\delta A} = \frac{\pi}{4(4 - \pi)} (\rho^2 + \rho^4/16 + \rho^6/64 + \cdots), \tag{59}$$

or to a good approximation

$$\rho_{\delta A} \approx \rho^2. \tag{60}$$

Equations (48) and (50) show that ρ is the normalized form of the
autocovariance function R_F of the complex field component F.
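
Equations (58) and (59) lend themselves to a simple Monte Carlo check. The Python below is an illustrative sketch only: the way the correlated pairs are generated, the sample size, the value ρ = 0.6, and all function names are assumptions made for the example. It draws pairs of complex Gaussian field values whose real parts (and, separately, imaginary parts) have normalized autocovariance ρ, then estimates the normalized autocovariance of the squared magnitudes and of the magnitudes.

```python
import math
import random


def correlated_fields(rho, n, seed=2):
    """Pairs (F1, F2) of complex Gaussian fields with unit-variance real
    and imaginary parts and normalized autocovariance rho."""
    rng = random.Random(seed)
    mix = math.sqrt(1.0 - rho * rho)
    pairs = []
    for _ in range(n):
        x1, y1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x2 = rho * x1 + mix * rng.gauss(0.0, 1.0)
        y2 = rho * y1 + mix * rng.gauss(0.0, 1.0)
        pairs.append((complex(x1, y1), complex(x2, y2)))
    return pairs


def normalized_covariance(u, v):
    """Covariance of the departures from the mean, normalized as in (57)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def rho_magnitude_series(rho):
    """First three terms of equation (59)."""
    factor = math.pi / (4.0 * (4.0 - math.pi))
    return factor * (rho ** 2 + rho ** 4 / 16.0 + rho ** 6 / 64.0)


rho = 0.6
pairs = correlated_fields(rho, 100000)
A1 = [abs(f1) for f1, _ in pairs]
A2 = [abs(f2) for _, f2 in pairs]
rho_sq_mag = normalized_covariance([a * a for a in A1], [a * a for a in A2])
rho_mag = normalized_covariance(A1, A2)
print(rho_sq_mag, rho ** 2)                  # compare with equation (58)
print(rho_mag, rho_magnitude_series(rho))    # compare with equation (59)
```

Both estimates should fall close to ρ² = 0.36, the magnitude result lying slightly below it, as the factor π/[4(4 − π)] ≈ 0.915 in equation (59) suggests.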


APPENDIX C 


Derivation of Equation 26 


The complex amplitude of the received signal appearing at the
antenna terminals may be written in the form

$$v = E_0 \sum_{n=1}^{N} a(\alpha_n) \exp\{j\phi_n\}$$

where E₀ is the common amplitude of the N azimuthal plane waves
incident on the mobile receiving antenna. The phase of each wave is
φ_n, and a(α) is the voltage response at the antenna terminals owing
to a unit-amplitude plane wave arriving at the azimuthal angle α.
At another point a distance ξ away (see Fig. 1) the signal at the
antenna terminals would be

$$v' = E_0 \sum_{m=1}^{N} a(\alpha_m) \exp\{j(\phi_m + k\xi \cos\alpha_m)\}.$$

Forming the complex product v*v′ and taking its expected value
yields the spatial autocovariance function of the two signals, namely

$$R_v(\xi) = \langle v^* v' \rangle_{av} = |E_0|^2 \sum_{n} \sum_{m} \langle a^*(\alpha_n)\, a(\alpha_m) \exp\{jk\xi \cos\alpha_m\} \rangle_{av}\, \langle \exp\{j(\phi_m - \phi_n)\} \rangle_{av}$$

where it has been assumed that the phases and angles of arrival of
the component waves are independent. Making the further assumption
that the phases are equiprobable throughout the range 0 to 2π,

$$R_v(\xi) = N |E_0|^2 \int_{-\pi}^{\pi} p(\alpha)\, g(\alpha) \exp\{jk\xi \cos\alpha\}\, d\alpha \tag{61}$$

where p(α) is the probability density function of the component plane
waves, and

$$g(\alpha) = a^*(\alpha)\, a(\alpha) = |a(\alpha)|^2$$

is the azimuthal power gain function of the antenna.

The temporal autocovariance function of v can be derived from
equation (61) for a receiver moving with constant velocity V by
making the substitution ξ = Vτ, where τ is a displacement in time.
Then

$$R_v(\tau) = \int_{-\pi}^{\pi} p(\alpha)\, g(\alpha) \exp\{j\omega_m \tau \cos\alpha\}\, d\alpha \tag{62}$$

where ω_m = 2πf_m with f_m = V/λ the maximum Doppler shift, and
N|E₀|² has been set equal to unity. The spectrum of the signal at the
antenna terminals is given by the Fourier transform of the temporal
autocovariance function of equation (62) and is

$$S(f) = \int_{-\infty}^{\infty} R_v(\tau) \exp\{-j2\pi f\tau\}\, d\tau = \int_{-\infty}^{\infty} d\tau \int_{-\pi}^{\pi} d\alpha\, p(\alpha)\, g(\alpha) \exp\{j(\omega_m \cos\alpha - 2\pi f)\tau\} \tag{63}$$

where f = ω/2π is the shift in frequency from the carrier frequency.
Reversing the order of integration in equation (63), the integration
w.r.t. τ yields a Dirac δ-function, thus

$$S(f) = \int_{-\pi}^{\pi} p(\alpha)\, g(\alpha)\, \delta(f_m \cos\alpha - f)\, d\alpha. \tag{64}$$

Now writing

$$h(\alpha) = f_m \cos\alpha - f \tag{65}$$

it may be noticed that the δ-function of a function may be written
in the form?®

$$\delta[h(\alpha)] = \sum_{n} \frac{\delta(\alpha - \alpha_n)}{|h'(\alpha_n)|} \tag{66}$$

where the α_n are all the values of α for which h(α) = 0, and the prime
denotes differentiation w.r.t. α. Hence, from equations (64), (65), and
(66) the spectrum of the signal at the antenna terminals is

$$S(f) = \frac{1}{f_m \sqrt{1 - f^2/f_m^2}} \left\{ [p(\alpha)\, g(\alpha)]_{\alpha = \cos^{-1}(f/f_m)} + [p(\alpha)\, g(\alpha)]_{\alpha = -\cos^{-1}(f/f_m)} \right\}$$


998 THE BELL SYSTEM TECHNICAL JOURNAL, JULY-AUGUST 1968 


which is equation (26). Notice that since the angle of arrival α must
be real, the frequency shift f must lie in the range ±f_m.
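
The final step of the derivation, evaluating p(α)g(α) at the two angles ±cos⁻¹(f/f_m) and dividing by |h′(α)|, can be sketched in code. The Python below is an illustration under assumed conditions: a uniform p(α) over −π to +π, an omnidirectional antenna with unit gain, and a hypothetical maximum Doppler shift of 100 Hz; the function names are not from the paper. For this case the spectrum reduces to 1/[πf_m(1 − f²/f_m²)^{1/2}], and the fraction of power within ±f_m/2 is (2/π) sin⁻¹(1/2) = 1/3.

```python
import math


def doppler_spectrum(f, f_m, p, g):
    """Equation (26): S(f) from the angle density p(alpha) and the
    antenna power gain g(alpha); taken as zero outside |f| < f_m."""
    if abs(f) >= f_m:
        return 0.0
    alpha = math.acos(f / f_m)
    jacobian = f_m * math.sqrt(1.0 - (f / f_m) ** 2)   # |h'(alpha)|
    return (p(alpha) * g(alpha) + p(-alpha) * g(-alpha)) / jacobian


def uniform_density(alpha):
    """Assumed p(alpha): uniform over -pi to +pi."""
    return 1.0 / (2.0 * math.pi)


def omnidirectional_gain(alpha):
    """Assumed g(alpha): unit gain in every azimuthal direction."""
    return 1.0


def band_power(f_lo, f_hi, f_m, steps=2000):
    """Midpoint-rule integral of S(f) between f_lo and f_hi."""
    df = (f_hi - f_lo) / steps
    return sum(doppler_spectrum(f_lo + (i + 0.5) * df, f_m,
                                uniform_density, omnidirectional_gain)
               for i in range(steps)) * df


f_m = 100.0   # hypothetical maximum Doppler shift in Hz
print(doppler_spectrum(0.0, f_m, uniform_density, omnidirectional_gain))
print(band_power(-0.5 * f_m, 0.5 * f_m, f_m))
```

A directive antenna can be tried by replacing `omnidirectional_gain` with any non-negative function of α.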


APPENDIX D 


Random Frequency Modulation of the Carrier 


Since frequency modulation is often used in mobile radio systems 
it is pertinent to inquire what will be the nature of the received audio 
signal when a single unmodulated frequency is transmitted. The phase 
of the received signal is changing with time in a random manner; 
hence its instantaneous frequency is random. 

It has been shown,”° based on the work of Rice,* that the p.d.f. of 
the time-rate of change of phase θ′ (the instantaneous frequency) for
narrowband Gaussian random noise with an amplitude spectrum
which is symmetrical about the carrier frequency, is

$$p(\theta') = \frac{1}{2} \sqrt{\frac{b_0}{b_2}} \left[1 + \frac{b_0}{b_2}\, \theta'^2 \right]^{-3/2} \tag{67}$$

where b₀ and b₂ are the zero and second moments, respectively,
about the carrier frequency of the amplitude spectrum S(f). Notice
that it has been assumed that there is no constant sinusoid present
in the noise. It has also been shown?® that the conditional p.d.f.
p(θ′|r), which is the density of the instantaneous frequency given
that the normalized envelope r is a certain value, is

$$p(\theta' \mid r) = \frac{1}{\sigma_{\theta'} \sqrt{2\pi}} \exp\left\{-\frac{\theta'^2}{2\sigma_{\theta'}^2}\right\} \tag{68}$$

which is a Gaussian distribution with zero mean and standard deviation

$$\sigma_{\theta'} = \frac{1}{r} \sqrt{\frac{b_2}{2b_0}}. \tag{69}$$


The above equations can be applied to the case of a mobile radio 
signal derived from an omnidirectional antenna in a uniformly 
scattered field. 

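The appropriate amplitude spectrum is that of equation (27) and
yields the moments,

$$b_0 = \int_{-f_m}^{f_m} S(f)\, df = 1 \tag{70}$$

and

$$b_2 = (2\pi)^2 \int_{-f_m}^{f_m} f^2 S(f)\, df = \tfrac{1}{2}\, \omega_m^2 \tag{71}$$

where ω_m is the maximum Doppler frequency shift in radians per second.
Equations (67) and (68) then become

$$p(\theta') = \left[ \sqrt{2}\, \omega_m \left(1 + 2\theta'^2/\omega_m^2 \right)^{3/2} \right]^{-1} \tag{72}$$

and

$$p(\theta' \mid r) = \frac{1}{\sigma_{\theta'} \sqrt{2\pi}} \exp\left\{-\frac{\theta'^2}{2\sigma_{\theta'}^2}\right\} \tag{73}$$

with

$$\sigma_{\theta'} = (1/2)\, \omega_m / r. \tag{74}$$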


The p.d.f. of equation (72) has a rather sharp maximum at θ′ = 0,
and falls to about 0.2 of this maximum value at θ′ = ±ω_m. For large
instantaneous frequency deviations the p.d.f. behaves asymptotically
as the inverse cube of the frequency. In practical terms this p.d.f. is
that of the amplitude of the output of a frequency discriminator in the
receiver for a single frequency transmitted.

The conditional p.d.f. of equation (73), which is Gaussian in form,
can also be interpreted as the p.d.f. of the amplitude of the discriminator
output. But this is the p.d.f. of the frequency deviations measured only
when the envelope amplitude is in the neighborhood of a particular
level r, which is the envelope normalized by its r.m.s. value. In the
particular example chosen the envelope has a Rayleigh distribution.

When r = 1 the conditional p.d.f. of the frequency deviations has a
spread of the order of the maximum Doppler frequency shift ω_m. The
spread will be 10 ω_m when r = 1/10, the probability that r ≤ 1/10 being
0.01. Similarly the spread will be 100 ω_m when r = 1/100, the probability
that r ≤ 1/100 being 0.0001. Thus the wider ranges of random-frequency
excursion are associated with only very small fractions of the total time.
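
The numerical statements above can be reproduced from equations (72) and (74) together with the Rayleigh distribution of the normalized envelope. The Python below is a small sketch with hypothetical function names; it assumes, for illustration, that the "spread" is read as twice the standard deviation of equation (74), that is ω_m/r, and it works in units where ω_m = 1.

```python
import math


def p_random_fm(theta_dot, omega_m):
    """Equation (72): p.d.f. of the instantaneous frequency deviation."""
    return 1.0 / (math.sqrt(2.0) * omega_m
                  * (1.0 + 2.0 * (theta_dot / omega_m) ** 2) ** 1.5)


def conditional_sigma(r, omega_m):
    """Equation (74): standard deviation of the deviation given envelope r."""
    return 0.5 * omega_m / r


def prob_envelope_below(r):
    """Rayleigh probability that the normalized envelope is at most r."""
    return 1.0 - math.exp(-r * r)


omega_m = 1.0   # work in units of the maximum Doppler shift
ratio = p_random_fm(omega_m, omega_m) / p_random_fm(0.0, omega_m)
print(ratio)    # 3 ** -1.5, about 0.19
for r in (1.0, 0.1, 0.01):
    print(r, 2.0 * conditional_sigma(r, omega_m), prob_envelope_below(r))
```

The loop lists, for each envelope level, the assumed spread and the fraction of time the envelope lies at or below that level.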
Some Transmission Characteristics of
Bell System Toll Connections


By I. NASELL 


(Manuscript received January 10, 1968) 


A systemwide survey of the transmission performance of built-up toll
connections was undertaken in 1966. The sampling plan underlying this
survey is discussed briefly. The results are presented in terms of
distributions of background noise levels, 1000 Hz loss, phase jitter, time to connect,
and airline distance between end offices. The measurement results are broken
down by mileage categories. Comparisons are made with the results from
the 1962 connection survey. It is found that noise performance has
improved since 1962 while loss performance is virtually unchanged.


I. INTRODUCTION 


Many systems engineering studies require detailed knowledge 
about transmission performance and transmission capabilities of the 
Bell System plant. The need for such information exists both for 
specific parts or building blocks of the network and for built-up 
connections between subscribers. A system-wide survey of noise and 
loss on toll connections was undertaken in 1962.1 The results of this 
survey found an important application in the setting of new over-all 
objectives for background noise.? 

A similar survey was undertaken in the summer of 1966. It is our 
purpose to describe this connection survey and to give its results. 
Present transmission performance of built-up toll connections is given 
in terms of distributions of noise, loss, and phase jitter. Furthermore, 
the results include distributions of time to connect, and the distribu- 
tion of airline distances between end offices of toll calls as presently 
established by customers. 

Connection results discussed in this paper describe the toll plant
contribution to the transmission performance on built-up toll
connections. In considering complete toll connections from subscriber
to subscriber, the influence of the loop plant must also be taken into
account. Some of its characteristics have been described by Hinder- 
liter.® 


II. TARGET POPULATION


The target population is the population about which information is 
desired. It was defined as the set of all toll calls made in the Bell 
System during the busy period (9 a.m. to 5 p.m.) of an ordinary 
business day. A call was considered a toll call if it satisfied the
following two conditions: (i) the customer received a bill which included
a separate charge for the call, and (ii) the originating and terminating
central offices did not home on the same toll office. The first
criterion assures us that the population contains only completed
messages rather than call attempts, while the second criterion means
that with some minor exceptions the toll calls included in the
population require at least one intertoll trunk for their completion.

The main difference between the population defined here and the 
population defined for the 1962 survey lies in the extension from the 
busy hour used in 1962 to the busy period. This extension provides 
for a more satisfactory reflection in the population of the traffic 
patterns generated by telephone subscribers. For example, cross- 
continental calls originating on the U. S. east coast were under- 
represented in the 1962 survey because of the different time zones 
on east and west coasts. Such under-representation does not exist in 
the 1966 survey. 


III. SAMPLING PLAN AND SAMPLE SIZE


The sampling plan can be described as a two-stage plan with pri- 
mary stratification and substratification and with the primary units 
selected with probabilities proportional to measures of size.*> The 
primary units were identified with Bell System end-office buildings. 
Two primary strata were defined, based on the size of the primary 
units. One of these strata contains those buildings in which at least 
400,000 toll messages originate annually; the other contains the re- 
maining smaller buildings. 

The first-stage sample contains 40 end-office buildings. Twenty- 
five of these were selected from the stratum with large offices, and 
fifteen from the small offices stratum. The sample units in the two 
strata were selected independently from lists that contained the total 
of 9052 Bell System end-office buildings that were in service on
January 1, 1964.

For each of the selected primary units, information was acquired 
about the outgoing toll traffic during the busy period of an ordinary 
business day. This information consisted of lists of terminating end- 
points of toll calls originating in the sample office during the indi- 
cated time period. Every call in each of these lists was assigned to 
one of three substrata. The substratification was based on the airline 
distance between originating and terminating end offices. Toll calls 
shorter than about 180 miles were assigned to substratum one, while 
calls longer than about 725 miles were assigned to substratum three. 

Independent selections of sample elements were made in each of the 
substrata for each sampled primary unit. The aim of the substrati- 
fication was to achieve a sample size that would give acceptable 
precision in the estimation of transmission performance for toll calls 
in each of a number of mileage categories. The success of this en- 
deavor is demonstrated by the confidence interval widths listed in 
the various tables of Section V.

An approximately equal number of toll calls was selected into the
sample in each sample office. The resulting sample is not self-weight- 
ing. This means that different sample toll calls in general carry 
different weights in the estimation of population characteristics. The 
sample contains a total of 1463 calls. Of these, 476 have an airline 
distance between end offices up to 180 miles, while 554 are between 
180 and 725 miles long, and 433 calls are longer than 725 miles. 
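
Because the sample is not self-weighting, an estimate of a population mean has to weight each sampled call, typically by the inverse of its overall selection probability. The Python sketch below shows a generic weighted (ratio-type) estimate of this kind. The readings and selection probabilities are made-up illustrative values, and the estimator is a common textbook form assumed here for illustration; it is not claimed to be the evaluation formula actually used in the survey.

```python
def weighted_mean(values, selection_probs):
    """Ratio-type estimate of a population mean from unequal-probability samples."""
    # Each sampled call represents 1/p calls of the population.
    weights = [1.0 / p for p in selection_probs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Hypothetical noise readings (dBrnC) and hypothetical selection probabilities;
# the long calls are assumed to have been sampled at a higher rate.
readings = [20.0, 30.0, 34.0]
probs = [0.001, 0.01, 0.02]

estimate = weighted_mean(readings, probs)
unweighted = sum(readings) / len(readings)
print(f"weighted estimate: {estimate:.2f} dBrnC")
print(f"unweighted mean:   {unweighted:.2f} dBrnC")
```

With these assumed numbers the unweighted mean overstates the noise level, because the heavily sampled long calls would otherwise count far more than their share of the population.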


IV. METHOD OF MEASUREMENT 


The measurement procedure in the survey was similar to that used 
in 1962. Thus, the aim of the measurement phase was to duplicate 
the calls included in the sample and make transmission measurements 
in the receive direction on the established connections. In addition, 
the time required to establish the connection was noted. 

All survey connections were established from an ordinary tele- 
phone set connected via a test set to a zero loop in the originating 
central office. The test set consisted of coils and switches and allowed 
the telephone set to be switched out of the connection and be con- 
veniently replaced by a suitable measurement instrument. This test 
set and the transmission measuring equipment used in the survey 
are manufactured by the Western Electric Company for Bell System 
use only.

Two separate connections were established for each call included 
in the sample. One of them was made to the balanced (quiet) ter- 
mination in the distant central office, and the other was made to the 
far-end milliwatt supply. The first one allowed the measurement of 
noise on the connection. The 3A noise measuring set® was used, and 
two readings were taken: one with C-message weighting and the 
other with 3 kHz flat weighting. As in the 1962 connection survey, 
no information about the physical routing of the call was acquired, 
and the measured noise levels did not include the subjective penalty 
due to the possible presence of compandored carrier facilities in the 
connection. 

The second connection was established to record the 1000 Hz loss. 
The received level was measured with a transmission measuring set 
and recorded to the nearest tenth of one dB. The peak-to-peak phase 
jitter of the received signal was measured on the same connection 
with a voiceband phase jitter meter. The calls to the milliwatt sup- 
plies were also used to acquire information about time to connect. 
This time was measured as the time elapsed after the last digit had 
been dialed or after the conversation with the operator was finished 
until the test tone or a ringback signal was heard. 

Not all of the terminating end offices for the sample calls were 
equipped with balanced terminations or milliwatt supplies. In order 
to allow measurements to be made, such sample calls were replaced 
by calls that terminated in an end office geographically close to the 
desired one, and equipped with proper test lines. Replacements of 
this type were made on somewhat less than 10 per cent of the sample 
calls. 


V. SURVEY RESULTS 


The survey results presented here have all been evaluated by computer
programs based on sample survey evaluation formulas contained in
Ref. 4. The transmission results give noise, loss, and phase jitter
as measured across a 900-ohm termination on a zero length loop.


5.1 3A Noise with C-Message Weighting 


A scatter diagram showing observed 3A noise levels with C-message
weighting as a function of the airline distance between end offices is
contained in Fig. 1. The previously observedt general trends of
increasing mean and decreasing standard deviation as the call distance
is increased are visible in this figure. These trends are explained
qualitatively by reference to the theory of power sums of random
variables. The noise level on a toll connection can be regarded as the
power sum of noise levels from a number of different noise sources,
with the number of noise sources increasing with call distance. Recent
results by Marlow’ and Nasell® show that the mean of a power sum
increases with the number of components, while the standard deviation
of the power sum decreases as the number of components is increased,
in line with the trends observed in Fig. 1.

Fig. 1—Scatter diagram of 3A noise level (C-message weighting) vs airline
distance.
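
The power-sum argument can be illustrated numerically. The Python sketch below adds the powers of several independent noise levels expressed in dB and estimates the mean and standard deviation of the result by simulation. The normal distribution of the individual levels, their mean of 20 dB and their standard deviation of 6 dB are assumptions chosen only for illustration; they are not survey values.

```python
import math
import random
import statistics

def power_sum_db(levels_db):
    """Combine noise levels given in dB by adding their powers."""
    return 10.0 * math.log10(sum(10.0 ** (x / 10.0) for x in levels_db))

def simulate(n_sources, trials=4000, mean_db=20.0, sd_db=6.0, seed=1):
    """Mean and standard deviation (dB) of the power sum of n_sources levels."""
    rng = random.Random(seed)
    sums = [power_sum_db([rng.gauss(mean_db, sd_db) for _ in range(n_sources)])
            for _ in range(trials)]
    return statistics.mean(sums), statistics.stdev(sums)

for n in (1, 4, 16):
    m, s = simulate(n)
    print(f"{n:2d} sources: mean {m:5.1f} dB, std. dev. {s:4.1f} dB")
```

Under these assumptions the simulated mean rises and the simulated standard deviation falls as the number of sources grows, which is the qualitative behavior described above.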

The regression line in Fig. 1 gives an estimate of the mean noise 
level under the assumption that the mean noise level is linearly 
related to the logarithm of the airline distance between end offices. 
The equation for the regression line is 


N = 12.6 + 2.0 \log_2 D    (1)

where D is the airline distance between end offices in miles, and N 
is the average 3A noise level. This equation shows that the average 
noise level increases by 2.0 dB for each doubling of the airline dis- 
tance between end offices. The fact that the variance changes with 
distance has been accounted for in the regression analysis; weights 
were applied in inverse proportion to the variance about the regres- 
sion line. 
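
Equation (1) is easy to evaluate directly. The short Python sketch below tabulates the regression estimate at a few distances. The base-2 logarithm is inferred from the statement that the level rises by 2.0 dB per doubling of distance, and the function name is chosen here for illustration.

```python
import math

def mean_noise_dbrnc(distance_miles):
    """Equation (1): mean 3A noise level with C-message weighting, in dBrnC."""
    return 12.6 + 2.0 * math.log2(distance_miles)

for d in (25, 50, 100, 200, 400, 800, 1600):
    print(f"{d:5d} miles: {mean_noise_dbrnc(d):4.1f} dBrnC")
```

Each successive row differs from the previous one by exactly 2.0 dB, since every distance is double the one before it.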

A summary of the results for 3A noise levels with C-message 
weighting is contained in Table I. As in most tables in this section, 
estimates are given of the mean and the standard deviation of the 
population distribution, and the mean is equipped with its 90 per 
cent confidence interval. Table I gives such results for each of eight 
mileage categories. These categories (except the first) are one double 
distance wide. The first four taken together correspond to the category
referred to as “short” (0-180 miles) by D. A. Lewinski,2 the 
next two cover the “medium” length and the last two contain the 
“long” calls (longer than 725 miles). The tendency for the mean 
to increase, and the standard deviation to decrease with distance is 
clearly demonstrated in this table. 

The noise distributions discussed here are all very close to normal. 
No significant difference was found between mean noise levels on 
operator-handled calls and mean noise levels on direct-dialed calls. 

A comparison between noise level distributions observed in the
1962 and the 1966 connection surveys is made in Table II. The table
indicates improved noise performance of the toll plant in the
intervening period; both means and standard deviations show generally
lower values in 1966, and the difference between means in the long
category is statistically significant. The results given for the 1962
survey deviate slightly from those quoted by Lewinski.2 The reason
is that Lewinski’s numbers are based on a sub-sample, while the
results in Table II are not. The differences are well within the
confidence intervals.


TABLE I—SUMMARY OF RESULTS FOR 3A
NOISE LEVELS WITH C-MESSAGE WEIGHTING

Airline distance    Mean          Std. dev.
(miles)             (dBrnC)       (dB)

0-23                19.8 ± 1.0    6.2
23-45               21.9 ± 1.7    6.5
45-90               22.4 ± 1.6    6.1
90-180              25.8 ± 1.4    5.3
180-360             28.9 ± 1.0    4.3
360-725             31.0 ± 0.8    3.6
725-1450            31.1 ± 1.3    4.2
1450-2900           34.6 ± 0.9    3.1


TABLE II—COMPARISON OF RESULTS FOR 3A NOISE WITH
C-MESSAGE WEIGHTING FROM THE 1962 AND 1966 SURVEYS

                    1962 Survey                1966 Survey
Airline distance    Mean          Std. dev.    Mean          Std. dev.
(miles)             (dBrnC)       (dB)         (dBrnC)       (dB)

0-180               23.4 ± 2.6    7.4          21.6 ± 0.8    6.4
180-725             31.0 ± 1.2    5.3          29.6 ± 0.7    4.2
725-2900            35.8 ± 1.5    4.0          32.5 ± 1.0    4.1

Table II also illustrates the improved precision achieved in the 
1966 survey compared with the precision of the 1962 survey. 


5.2 3A Noise with 3 kHz Flat Weighting 


A scatter diagram of 3A noise levels with 3 kHz flat weighting as 
a function of the airline distance between end offices is shown in Fig. 
2. It indicates much less of a distance dependence of the observed 
noise levels than that shown in Fig. 1. This is to be expected since 
flat weighted noise readings are predominantly caused by low-fre- 
quency noise components that fall below the lower cutoff frequency 
of most carrier facilities used in the toll plant. 

A summary of the results for 3A noise with 3 kHz flat weighting 
is given in Table III. The table reinforces the impression that the 
distance dependence of both mean and standard deviation is very 
slight. It does, however, bring out the fact that both means and 
standard deviations of operator-handled calls are larger than those 
for direct-dialed calls. This fact is believed to be related to differ- 
ences in local trunking arrangements. All of the distributions of flat 
weighted noise levels have a moderate amount of positive skewness. 


5.3 1000 Hz Loss 


The end-office to end-office loss at 1000 Hz is shown as a function
of distance in the scatter diagram of Fig. 3. Just as was the case in
the 1962 survey, we find the distance dependence of the loss to be
only moderate. Table IV summarizes the results for each of the eight
mileage categories discussed above. A small trend for both mean
and standard deviation to increase with distance is seen to exist.
This is related to the higher probability of encountering more than
one intertoll trunk in tandem for the longer connections. All the
loss distributions deviate somewhat from normality through a moderate
amount of positive skewness. Loss values exceeding 20 dB
were found both on operator-handled and on direct-dialed calls.

Fig. 2—Scatter diagram of 3A noise level (3 kHz flat weighting) vs airline
distance.

TABLE III—SUMMARY OF RESULTS FOR 3A NOISE WITH 3 kHz
FLAT WEIGHTING

                    Over-all                    Operator                    DDD
Airline distance    Mean dBrn      Std. dev.    Mean dBrn      Std. dev.    Mean dBrn      Std. dev.
(miles)             (3 kHz flat)   (dB)         (3 kHz flat)   (dB)         (3 kHz flat)   (dB)

0-180               43.9 ± 1.6     7.4          46.7 ± 3.1     9.1          42.5 ± 1.…     …
180-725             45.9 ± 2.4     7.6          47.8 ± 4.0     8.8          43.6 ± 1.…     …
725-2900            45.2 ± 1.5     6.0          46.5 ± 2.5     7.0          43.9 ± 1.…     …


Fig. 3—Scatter diagram of 1000 Hz loss vs airline distance.


TABLE IV—SUMMARY OF RESULTS FOR END-
OFFICE TO END-OFFICE LOSS AT 1000 Hz

Airline distance    Mean          Std. dev.
(miles)             (dB)          (dB)

0-23                6.8 ± 0.6     2.4
23-45               7.7 ± 0.5     2.6
45-90               7.1 ± 0.7     2.6
90-180              7.4 ± 0.6     2.8
180-360             8.7 ± 0.6     2.8
360-725             9.4 ± 1.0     2.9
725-1450            9.5 ± 0.4     2.9
1450-2900           9.7 ± 0.8     3.0


TABLE V—COMPARISON OF LOSS DISTRIBUTIONS FOR
OPERATOR-HANDLED AND DIRECT-DIALED CALLS

                    Operator                   DDD
Airline distance    Mean          Std. dev.    Mean          Std. dev.
(miles)             (dB)          (dB)         (dB)          (dB)

0-180               7.5 ± 0.6     3.0          7.0 ± 0.4     2.3
180-725             9.3 ± 0.8     3.1          8.5 ± 0.6     2.5
725-2900            10.2 ± 0.6    2.7          8.9 ± 0.6     3.0


Operator-handled calls will in general require one more trunk for
their completion than direct-dialed calls. The total loss on the
connection is, therefore, expected to be somewhat higher on
operator-handled than on direct-dialed calls. A comparison between the loss
distribution parameters on the two types of calls is made in Table 
V. The table shows a lower mean loss on DDD calls in each of the 
three mileage categories, and in the third category the difference is 
significant. The mean loss difference is seen to range from 0.5 dB 
for short calls to 1.3 dB for long calls. No rationale is known for a 
distance dependence of this loss difference. 

A comparison of means and standard deviations of loss distribu- 
tions observed in the 1962 and 1966 surveys is made in Table VI. 
No large changes in the intervening time period are indicated. 


5.4 Phase Jitter 


The phase jitter measurements in the survey reveal the amount of
phase modulation that an unmodulated sinusoidal carrier of 1000 Hz
is subjected to on a toll connection. These measurements were
included since certain types of data transmission are susceptible to
phase modulation of transmitted signals. The measurements give
the peak-to-peak phase jitter in degrees for jitter components
between 10 Hz and 120 Hz on the signal transmitted by the far-end
1000 Hz milliwatt supply. A scatter diagram of observed phase jitter
versus connection distance is contained in Fig. 4. The connections
for which a phase jitter of 21 degrees is indicated are connections
where the phase jitter measurement was larger than or equal to 21
degrees. A trend for the average phase jitter to increase with mileage
is indicated by the figure. The phase jitter distributions are definitely
not normal; they show a high amount of positive skewness. Because of
this, the summary data in Table VII give 10-, 50-, and 90-percent points
of the phase jitter distributions rather than means and standard
deviations.

Operator-handled calls that are of short and medium length show
a significantly higher median phase jitter than direct-dialed calls of
corresponding length, while no apparent difference exists for long
calls. A numerical comparison is made in Table VIII.


TABLE VI—COMPARISON OF LOSS DISTRIBUTIONS FROM
THE 1962 AND 1966 SURVEYS

                    1962 Survey                1966 Survey
Airline distance    Mean          Std. dev.    Mean          Std. dev.
(miles)             (dB)          (dB)         (dB)          (dB)

0-180               7.3 ± 0.6     2.8          7.2 ± 0.4     2.6
180-725             8.9 ± 0.7     3.0          8.9 ± 0.7     2.9
725-2900            9.3 ± 1.4     3.8          9.6 ± 0.5     2.9


Fig. 4—Scatter diagram of phase jitter vs airline distance.


TABLE VII—SUMMARY OF RESULTS FOR
PEAK-TO-PEAK PHASE JITTER*

Airline distance    Phase jitter (degrees)
(miles)             10%    50%    90%

0-23                1      3      7
23-45               1      3      7
45-90               2      4      15
90-180              2      7      14
180-360             2      7      20
360-725             2      11     21
725-1450            4      12     20
1450-2900           3      12     21

* The table gives the 10-, 50-, and 90-per-cent points (in degrees) of the phase
jitter distributions in each mileage category.
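
The reason for reporting percent points rather than means can be seen in a small example. The Python sketch below uses a hypothetical set of jitter readings, one of which exceeds the 21-degree limit of the indication, and compares the mean and the 10-, 50-, and 90-per-cent points before and after clipping at 21 degrees. The readings and the nearest-rank percentile rule are assumptions made for illustration only.

```python
def percentile(values, pct):
    """Nearest-rank percentile (pct is an integer from 1 to 100)."""
    ordered = sorted(values)
    rank = max(1, -(-pct * len(ordered) // 100))   # integer ceiling
    return ordered[rank - 1]

# Hypothetical peak-to-peak jitter readings in degrees (not survey data).
jitter = [1, 2, 2, 3, 3, 4, 5, 7, 12, 40]
clipped = [min(x, 21) for x in jitter]   # indication limited to 21 degrees

for name, data in (("true", jitter), ("clipped", clipped)):
    points = [percentile(data, p) for p in (10, 50, 90)]
    print(name, "mean:", sum(data) / len(data), "10/50/90 points:", points)
```

In this example the percent points are unchanged by the clipping, while the mean is not, so percent points remain meaningful even when the largest readings are only known to be at least 21 degrees.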


5.5 Time to Connect 


The time to connect is shown versus distance in the scatter diagram 
of Fig. 5. A range up to 100 seconds is used to cover some operator- 
handled calls that suffered long delays. The scatter diagram shows 
a tendency for the average time to connect to increase with distance. 
This is a reflection of the higher average number of intertoll trunks 
in tandem for the longer connections, which in turn means that a 
larger number of switching offices is involved in establishing the 
longer connections. 

A separation of operator-handled calls from direct-dialed calls is
made in Table IX. It shows that the average time to connect is
longer for operator-handled calls than for direct-dialed calls. It also
shows that the average time to connect is virtually independent of
distance for operator-handled calls, while a definite trend exists for
direct-dialed calls. Finally, we notice that the standard deviations
are considerably higher for the operator-handled calls than for those
that are direct-dialed. For these reasons, a detailed study of the time
to connect for direct-dialed calls is of interest.


TABLE VIII—PEAK-TO-PEAK PHASE JITTER FOR OPERATOR-
HANDLED AND DIRECT-DIALED CALLS*

                    Phase jitter (degrees)
Airline distance    Operator               DDD
(miles)             10%    50%    90%      10%    50%    90%

0-180               2      4      11       1      2      9
180-725             3      11     21       2      6      18
725-2900            5      11     20       3      12     21

* The table gives the 10-, 50-, and 90-per-cent points (in degrees) of the phase
jitter distributions in each mileage category.


Fig. 5—Scatter diagram of time to connect vs airline distance.


TABLE IX—COMPARISON OF DISTRIBUTIONS OF TIME TO
CONNECT FOR OPERATOR-HANDLED AND DDD CALLS

                    Time (seconds)
Airline distance    Operator                   DDD
(miles)             Mean          Std. dev.    Mean          Std. dev.

0-180               24.7 ± 4.2    21.1         11.1 ± 0.9    4.6
180-725             27.0 ± 4.5    20.5         15.6 ± 1.0    5.0
725-2900            24.8 ± 2.4    11.1         17.6 ± 2.1    6.6

A scatter diagram of time to connect versus distance is given for 
DDD calls in Fig. 6. The regression line shown has the equation 


T = 7.6 + 0.9 \log_2 D    (2)


where D is the airline distance between end offices in miles, and T 
is the average time to connect in seconds. The regression equation 
shows that the average time to connect increases by 0.9 seconds for 
each doubling of the airline distance between end-offices. 
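
As a quick check of equation (2), the Python sketch below evaluates the regression line at a few distances. As with equation (1), the base-2 logarithm is inferred from the "per doubling" statement in the text, and the function name is chosen here for illustration.

```python
import math

def mean_time_to_connect_s(distance_miles):
    """Equation (2): mean time to connect on DDD calls, in seconds."""
    return 7.6 + 0.9 * math.log2(distance_miles)

for d in (8, 64, 512, 2048):
    print(f"{d:5d} miles: {mean_time_to_connect_s(d):4.1f} s")
```

At 2048 miles (eleven doublings from one mile) the line gives 17.5 seconds, which is close to the 17.6-second mean listed for long DDD calls in Table IX.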

A summary of the parameters of time to connect distributions for 
DDD calls is given in Table X. The table indicates that the regres- 
sion assumption of a linear relation between the mean time to con- 
nect and the logarithm of the airline distance may be an oversim- 
plification; the mean time to connect is virtually constant in the 
first three and in the last two mileage categories; in between it 
increases by more than 0.9 seconds per double distance. 

The distributions of time to connect over all calls have a high
positive skewness as indicated by the scatter diagram in Fig. 5. On
the other hand, only a small amount of skewness is present in the
distributions for DDD calls, as seen from the scatter diagram in
Fig. 6.


Fig. 6—Scatter diagram of time to connect on direct-dialed calls vs airline
distance. (Axes: time to connect in seconds vs miles.)


TABLE X—SUMMARY OF RESULTS FOR TIME
TO CONNECT ON DDD CALLS

Airline distance    Time (seconds)
(miles)             Mean    Std. dev.


5.6 Distance Distribution 


The distribution of airline distances between end offices of toll 
calls is given in Fig. 7. The distribution is seen to deviate somewhat 
from a log-normal distribution, and it is virtually truncated at 2500 
miles. Table XI gives estimated percentages of toll calls that fall in 
each of the eight mileage categories. A comparison with the results 
from the 1962 survey shows no important changes. The fact that 
only about four per cent of all toll calls are longer than 725 miles 
illustrates a problem for the design of the sampling plan. Unstratified 
sampling would tend to give a sample in which only about four per 
cent of the sample calls exceed 725 miles in length. In contrast to 
this, precision requirements dictate approximately equal sample size 
for short, medium, and long calls. The problem was solved, as men- 
tioned before, by the use of substratification based on the airline 
distance between end offices of toll calls. 
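
The size of the problem can be made concrete with the figures quoted in this paper. The Python sketch below compares the number of long calls that an unstratified sample of the same total size would be expected to contain with the number actually obtained through substratification. The 3.9 per cent share is the sum of the two longest classes in Table XI; the simple proportional expectation is an idealization introduced here.

```python
short, medium, long_calls = 476, 554, 433      # sample sizes from Section III
total_sample = short + medium + long_calls

share_long = 0.039    # Table XI: 725-1450 plus 1450-2900 miles

expected_long_unstratified = total_sample * share_long
print("total sample size:", total_sample)
print("expected long calls without stratification:",
      round(expected_long_unstratified))
print("long calls actually sampled:", long_calls)
```

Under this idealization an unstratified sample would hold only about 57 long calls, against the 433 obtained with substratification.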


1016 THE BELL SYSTEM TECHNICAL JOURNAL, JULY-AUGUST 1968


Fig. 7—Distance distribution of toll calls. (Axes: percent of toll calls
shorter than abscissa vs miles.)


TABLE XI—DISTANCE
DISTRIBUTION OF TOLL CALLS

                    Percent of calls in distance class
Airline distance    1966 Survey    1966 Survey    1962 Survey
(miles)             (by class)     (grouped)      (grouped)

0-23                33.7
23-45               20.0
45-90               18.2           83.7           85.0
90-180              11.8

180-360              8.0           12.4           11.0
360-725              4.4

725-1450             2.3            3.9            4.0
1450-2900            1.6


VI. CONCLUDING REMARKS


The 1966 connection survey represents an improvement over the
1962 survey in terms of precision. It also represents a small extension
of the measurement program, to include measurements of such
entities as phase jitter and time to connect. It does, however, suffer
from certain limitations, which it shares with the 1962 survey. Most
important is the fact that a number of important transmission
parameters, such as frequency response, delay distortion, and impulse
noise, were not measured. An additional limitation is that the milliwatt
signal source at the far end of each call could not be calibrated.

The use of specially-equipped test teams at both ends of the con- 
nections would alleviate both of these limitations. Studies are, there- 
fore, under way to investigate the feasibility of using a 3-stage 
sampling plan in place of the 2-stage plan that was used in the 1966 
survey. The main accomplishment of the 3-stage plan would be to 
limit the number of far-end end offices involved in the sample con- 
nections, thereby reducing the total traveling cost. 

A toll connection appraisal program has recently been introduced 
in the Operating Companies of the Bell System. The procedures of 
this program are similar to those used in the connection survey 
described here. However, the main purpose of this appraisal pro- 
gram is to provide data to aid in the location of weak spots and also 
to aid in managerial decisions affecting the transmission performance 
of the present plant. In contrast to this, the data collected in the 
connection survey will find its main application in systems engi- 
neering studies conducted at Bell Laboratories and elsewhere in the 
Bell System. 

It might be surprising that a sample of only 1463 calls originating 
in 40 end offices suffices to estimate the transmission performance 
of the 15 million toll calls that originate each day in one of more 
than 9000 end-office buildings. The results presented here show, how- 
ever, that the achieved precision is indeed acceptable for a number 
of engineering applications. This fact demonstrates very concretely 
what can be achieved for data-acquisition purposes by a judicious 
application of the powerful methods of modern sample survey theory. 
Negative Impedance Boosting 


By L. A. MEACHAM 
(Manuscript received February 6, 1968) 


Linearized and feedback-stabilized negative impedance circuits having 
only R, C, and solid state components, powered in series at intervals along 
a cable pair, offer new possibilities in bilateral transmission. After dis- 
cussing the basic negative impedance boosting units and the transmission 
characteristics they impart to a line (computed, with experimental con- 
firmation), this paper describes a field test of two 32-mile telephone lines, 
largely 22-gauge, each having an insertion loss of only 3 dB at 1,000 
Hz. It also shows means for broadening bandwidth and almost eliminating 
delay distortion over negative impedance boosted lines. Treatment of this 
sort adapts them to unusual uses. Examples include converting rectangular 
to raised-cosine pulses in transmission, without pulse-forming circuitry, 
and the bilateral two-wire transmission of carrier or pulse signals in both 
directions simultaneously, without frequency separation. 


I. INTRODUCTION 


The insertion of lumped negative impedances at intervals along 
each conductor of a cable pair has long been of interest as a means 
of improving bilateral transmission. In the familiar expressions for 
propagation constant 





\gamma = \alpha + j\beta = \sqrt{(R + j\omega L)(G + j\omega C)}    (1)

and characteristic impedance

Z_0 = R_0 + jX_0 = \sqrt{(R + j\omega L)/(G + j\omega C)},    (2)


if one lets both G and R go to zero on presumption that the shunt 
conductance of well-insulated cable is negligible and that the copper 
resistance can effectively be canceled by active devices, he encounters 
four challenging approximations: 


\alpha \approx 0, \quad \beta \approx \omega\sqrt{LC}, \quad R_0 \approx \sqrt{L/C} \quad \text{and} \quad X_0 \approx 0.    (3)

To the extent of their accuracy these describe lossless transmission, 
free of phase distortion, between matching terminations that are 
resistive and independent of frequency. Such properties would indeed 
be of value in either analog or digital transmission.* 
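
Expressions (1) through (3) can be checked numerically. The Python sketch below evaluates the propagation constant and characteristic impedance at 1000 Hz for an ordinary pair and for the idealized case with R and G set to zero. The cable constants are those quoted later in this paper for 22-gauge cable, with the capacitance taken as 0.0825 microfarad per mile; that reading of the printed value, the choice of 1000 Hz, and the function name are assumptions made for illustration.

```python
import cmath
import math

def line_constants(R, L, G, C, f_hz):
    """Propagation constant (per mile) and characteristic impedance (ohms)."""
    w = 2.0 * math.pi * f_hz
    series = complex(R, w * L)    # R + jwL, ohms per mile
    shunt = complex(G, w * C)     # G + jwC, siemens per mile
    return cmath.sqrt(series * shunt), cmath.sqrt(series / shunt)

L_MILE = 0.874e-3     # henry per mile
C_MILE = 0.0825e-6    # farad per mile (assumed reading of the printed value)

for R in (173.0, 0.0):            # ordinary pair, then ideally cancelled R
    gamma, z0 = line_constants(R, L_MILE, 0.0, C_MILE, 1000.0)
    print(f"R = {R:5.1f}: alpha = {gamma.real:.4f} Np/mile, "
          f"beta = {gamma.imag:.4f} rad/mile, "
          f"Z0 = {z0.real:.1f} {z0.imag:+.1f}j ohms")
```

With R and G both zero the attenuation constant vanishes and the characteristic impedance becomes purely resistive, as the approximations in (3) state.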

In the early 1940s effort toward canceling R was devoted to high 
speed point-contact thermistors as the requisite “current-controlled” 
or “open-circuit-stable” negative impedance elements,? but lack of 
stability and uniformity were severe obstacles. Similar handicaps were 
later encountered with other devices such as avalanche transistors.* 
At least partly for such reasons, development eventually tended to 
abandon the scheme of distributing bilateral active elements along 
a pair over which they could also be powered, and instead moved 
toward combinations of shunt and series type negative impedances 
(transformer coupled, locally powered, and designed to match the 
cable in characteristic impedance) that could be installed at con- 
venient points such as in central offices, and there contribute modest 
amounts of bilateral gain. A well-known outcome was the E-type 
repeater,* of which both vacuum-tube and transistor versions have 
found extensive use in the exchange plant of the Bell System. 

Recently, however, a new look has been taken at negative impedance
boosting† (NIB). This paper outlines in chronological 
order various findings of a small research project that has been in 
progress for several years at Bell Laboratories. 


II. BASIC NIB CIRCUIT 


An NIB unit devised early in this study and used as a basic tool 
appears schematically in Fig. 1. Figure 2 shows its d-c V-I charac- 
teristic and equivalent circuit. For convenience the latter represents 
the total impedance Z, of a pair of units, one in series with each 
conductor, at a boosting point. 

Accordingly, for small currents (below the first bend of the char- 
acteristic) —R, = +2R3 and R, = 4R.. At that bend the silicon 
transistors begin to conduct, while at the second bend they saturate. 


* As early as 1887 Oliver Heaviside defined a “distortion constant” (R/L — 
G/C) = 2¢ and an “attenuation constant” (R/L + G/C) = 28, and showed 
that distortion could be “annihilated” by increasing G to make G/C = R/L’. 
He undoubtedly would have stressed the benefits of making both o and 6 
approach zero, had he known of any way to reduce R except the use of more 
copper. 

† “Boosting” is proposed as a better term than “loading,” on the grounds 
that the mass/inductance analogy suggested by the latter is irrelevant. 
Fig. 1—Circuit schematic of basic negative impedance booster unit. 


In the active region between bends simple circuit analysis shows that 


4RLR 
R, = >> 4 
o> PR,” (4) 
: _op | Ree — 1) - Ba | 
fia or] RoeR.  Ole ©) 
and 

a _ BCs, 

GC. aa = ie (6) 


Here α, the usual ratio of collector to emitter current, is assumed con- 
stant and the same for both transistors. Expression (5) tacitly takes 
into account the nonlinearity of the emitter junctions in Fig. 1; this 
follows from the fact that the voltage across each emitter junction is 
compensated, except for an approximately constant voltage difference 
of about 0.5 volt, by the drop across a germanium junction diode car- 
rying a proportional and almost equal current. The 0.5-volt difference, 
inherent between silicon and germanium, effectively affords a bias 
essential to the circuit. The drop across R, equals this bias at the first 
bend, and to a close approximation exceeds the drop across R, by the 
same value of 0.5 volt throughout the active region. The important 
result of this compensation is a high degree of linearity between the 
bends, which correspondingly are sharpened almost into cusps. 


III. BASIC NIB LINE 


Some basic features of an NIB line are illustrated in the telephone
customer’s loop of Fig. 3. The boosters have a spacing that is
(preferably) regular and not much greater than one quarter wavelength at
the top of the transmission band. For telephone speech, a suitable
spacing would be 12,000 feet. In general, boosting gives the line a
characteristic impedance substantially lower than that of ordinary
nonloaded or inductively loaded telephone lines. Hence the line circuit
at the central office includes an impedance-matching transformer, as
well as means for regulating the d-c loop current roughly at the center
of the active region of the V-I characteristic. The telephone set can
be conventionally powered by this current, and should have a resistive
impedance, preferably matching that of the line.

Fig. 2—DC characteristic and equivalent circuit of basic NIB unit.

Stability criteria are well known® for such arrangements. In practical
terms, for regularly spaced NIB units with the equivalent circuit
of Fig. 2, the system is found stable (experimentally and by
computer) when the net d-c variational resistance (ΔV/ΔI) of the
loop, including its terminations, is positive, provided that the time
constant T_n = R_n C_n is greater than a certain critical value. In this
study (except where noted) we have consistently made


\Sigma R = lR + R_s - R_n = 0,


where R is the copper resistance per unit length of cable and l is the
NIB spacing. The negative capacitor −C_n bypasses −R_n, and with
rising frequency gradually reduces the negative real component of
terminal impedance of the NIB unit. One way of visualizing the need
for such reduction is to notice that the positive copper resistance
adjacent to each of the four terminals of the two NIB units at a
boosting point is also effectively reduced, being bypassed by the
mutual line capacitance. Hence with rising frequency, a point of
instability is almost certain to be reached unless the negative resistance
diminishes at least as fast as the copper resistance as seen from the
NIB terminals.
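
The frequency behavior described here can be sketched numerically. The Python example below assumes that the equivalent circuit consists of a positive resistance R_s in series with the parallel combination of −R_n and −C_n, which is consistent with the compensation condition above but is an assumption as far as this sketch is concerned. The parameter values are those of the worked example later in this section.

```python
import math

def nib_impedance(f_hz, R_s=97.3, R_n=490.5, T_n=16e-6):
    """Impedance of R_s in series with (-R_n in parallel with -C_n)."""
    w = 2.0 * math.pi * f_hz
    # Admittance of the parallel pair is -(1 + j w T_n)/R_n, with T_n = R_n C_n.
    return R_s - R_n / complex(1.0, w * T_n)

for f in (0, 1000, 3000, 6000, 10000):
    z = nib_impedance(f)
    print(f"{f:6d} Hz: real part {z.real:7.1f} ohms, "
          f"imaginary part {z.imag:6.1f} ohms")
```

In this model the real part starts at R_s − R_n = −393.2 ohms at zero frequency and becomes steadily less negative as frequency rises, while the imaginary part is positive, so the unit also behaves like a small series inductance.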

T_n is therefore an important parameter of the NIB circuit. Increasing
it raises the margin of stability, but at the penalty of reducing
transmission bandwidth. The midspacing image impedance of the
line is also affected by T_n. It is found that when the line conductance
per section (lG) is negligible, and when the line resistance per section
(lR) is exactly compensated by R_n − R_s, the midspacing image
impedance Z_M remains essentially constant and resistive as the
frequency falls toward zero. As shown in the Appendix, the value it thus
approaches is given precisely by











ee 2 (Re ee 


where R, L and C are the usual primary cable constants (per unit
length). R_s enters (7) implicitly, being the difference between R_n
and lR.

A related effect of T_n is upon the phase velocity V_M = ω/β_M, which
also approaches an asymptotic value:


V'= lim Via = ag (8) 
[<t/2-> 





Fig. 3—Negative impedance boosted subscriber line and central office
terminating circuit.

Expressions (7) and (8) are useful, for the values they give hold
approximately over a major part of the low-loss frequency band. As
an example, take the case of 12,000-foot (2.2727-mile) NIB spacing,
along 22-gauge BSA cable that has the primary constants (at low
frequency) R = 173 ohms per mile, L = 0.874 × 10^-3 henry per mile,
and C = 0.825 × 10^-7 farad per mile. For the NIB parameters (per
section) R_s = 97.3 ohms, R_n = 490.5 ohms, and T_n = 16 × 10^-6
second, expressions (7) and (8) tell us


Z’ = 198 ohms 
and 
V’ = 61,200 miles per second. 


For this velocity, the spacing becomes a quarter wavelength at the 
frequency 


f = V'/(4l) = 6,730 Hz.
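The figures quoted in this example can be checked with a few lines of arithmetic. The sketch below is not part of the original paper. It assumes that the low-frequency image impedance has the form $Z'^2 = L/C + R_N T_N/(lC) - (lR)^2/12$ (obtained from a series expansion with G = 0 and exact compensation of copper resistance), that $V' = 1/(CZ')$, and that the quarter-wavelength frequency is $V'/(4l)$. The variable names, including the subscripts on the NIB resistances, are illustrative assumptions.

```python
import math

# Example values quoted in the text (22-gauge BSA cable, 12,000-foot spacing).
l = 12000 / 5280      # NIB spacing, miles
R = 173.0             # copper resistance, ohms per mile
L = 0.874e-3          # inductance, henry per mile
C = 0.825e-7          # capacitance, farad per mile
R_s = 97.3            # positive resistance per section, ohms (name assumed)
R_n = 490.5           # magnitude of negative resistance, ohms (name assumed)
T_n = 16e-6           # NIB time constant, seconds

# Residual resistance per section; near zero when compensation is exact.
sum_R = l * R + R_s - R_n

# Assumed low-frequency limits (see the lead-in for the assumptions made).
z_prime = math.sqrt(L / C + R_n * T_n / (l * C) - (l * R) ** 2 / 12)
v_prime = 1 / (C * z_prime)
f_quarter = v_prime / (4 * l)

print(f"sum R = {sum_R:.3f} ohms")
print(f"Z' ~ {z_prime:.1f} ohms, V' ~ {v_prime:.0f} miles/s, f ~ {f_quarter:.0f} Hz")
```

Under these assumptions the results land within about one per cent of the values quoted in the text (198 ohms, 61,200 miles per second, 6,730 Hz); the small difference may reflect rounding in the published constants.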


IV. COMPUTED CHARACTERISTICS 


Computer programs have been worked out to give propagation 
constant and midspacing image impedance as functions of frequency, 
for any set of cable primary “constants” (which of course actually 
vary with frequency) and NIB equivalent circuit parameters. Some 
typical results, plotted in Figs. 4, 5, 6, apply to the set of parameters 
used in the foregoing example. For comparison, characteristics are 
included for nonloaded (NL) and loaded (H88) cable, also of 22 
gauge. (H88 loading uses 88 mH inductors at 6,000-foot intervals.) 

Among varied uses of these programs has been the finding, by successive
approximations, of the minimum or “just stable” time constant
(jstc) for various gauges and NIB spacings. Figure 7 shows attenuation
constant versus frequency for the jstc condition and also for a time
constant 10 per cent greater. To illustrate another use, the effect
upon attenuation of moderate over- or undercompensation is pictured 
in Fig. 8. Here the loss per mile between image impedances is shown for 
errors in compensation of ±20 ohms, or approximately ±5 per cent
of the copper resistance. Over most of the useful band, these errors 
introduce almost flat gain or loss of about 0.2 dB per mile. Their effects 
upon phase velocity and image impedance (not plotted) are small 
except at frequencies below 500 Hz.* 

*In that region the variation of image impedance is such that if Fig. 8 were 
a plot of insertion loss between 198-ohm resistive terminations, it would show 


the gain or loss of 0.2 dB per mile extending almost unchanged all the way 
down to zero frequency. 
Fig. 4—Attenuation constant of 22-gauge BSA cable; nonloaded, H-88
loaded, and negative impedance boosted.

Fig. 5— Phase velocity of 22-gauge BSA cable; nonloaded, H-88 loaded and
negative impedance boosted.

Fig. 6— Characteristic impedance or midspacing image impedance of 22-gauge
BSA cable; nonloaded, H-88 loaded, and negative impedance boosted.


In general, our laboratory tests using either dependably representa- 
tive artificial lines, or pairs in actual cable on spools, confirmed the 
computed results very accurately. Conversation over lines several 
12,000-foot NIB sections in length was found highly satisfactory,
remarkably free of hum, echo, and distortion. But the need was seen 
for experience with NIB transmission under actual field conditions. 


V. “ROUND ROBIN” FIELD TEST 


With the cooperation of the New Jersey Bell Telephone Company 
two NIB lines were set up using pairs in existing interoffice cables 
over the route shown in Fig. 9. For convenience of measurement, 
both ends of each line were brought to the same room at the Murray 
Hill, New Jersey, branch of Bell Laboratories. Experimental applique
circuits were provided for coupling to the Murray Hill PBX,
permitting each pair to serve as a regular telephone extension when
not in use for other tests.

The cable, 32.4 miles long, was all of 22 gauge except for 0.5 per 
cent of 24 and 4.2 per cent of 26 gauge. Seventy-seven per cent of 
its length was underground, the rest aerial. All boosting points, one 
for each of 16 sections ranging from 9,750 to 13,380 feet long, were 
in manholes. There the NIB units were plugged into jacks within 
containers that could be conveniently opened and resealed, taking
advantage of equipment already installed (for housing regenerative
repeaters of the T1 type PCM transmission system).

The NIB circuits were adapted to field conditions in the following
ways:


(i) By giving Rs an appropriate positive temperature coefficient,
the net coefficient of each NIB unit was matched approximately to
that of copper. It was recognized that this compensation would be
reasonably accurate for underground cable, but little better than
seasonal for aerial.

(ii) Taps were provided along R3 so that any one of four values of
$-R_N$ could be selected by strapping, as a best fit for the section
resistance. No corresponding adjustment of Cs proved necessary, as the
image impedance fortunately turned out to be kept almost constant
by the related changes in $T_N$, $-R_N$ and $l$.

Fig. 7— Attenuation constant of line with NIB time constant at or near “just
stable” value; 22-gauge BSA cable, NIB spacing 12,000 feet. Curves are
labeled $T_N$ = JSTC = 13.40 μs and $T_N$ = JSTC + 10% = 14.74 μs.

(iii) To increase stability margins in view of the nonuniform NIB
spacing, $T_N$ was raised to 20 μs for the mean length of 22-gauge
section. This gave an image impedance Z’ of 225 ohms.

(iv) For the two end sections of each line, which happened to
include all the 26-gauge cable, $T_N$ was adjusted by changing the
capacitor Cs (Fig. 1) to make the image impedance roughly equal to that
of other sections (225 ohms).


Except for these adjustments, the NIB units had the equivalent
circuit parameters listed in the discussion of expressions (7) and (8).
They were normally powered by 16 mA of loop current, with their
linear negative slopes extending from 6 to 26 mA. This range was
twice as great as required for telephone speech; the excess was an
allowance for possible hum current. The total IR drop in copper and
NIB units of either loop was about 186 volts; hence, with an additional
4-volt drop across a 225-ohm resistive station set, the potentials on
tip and ring conductors at the “office end” were approximately ±95
volts from ground.

Fig. 8 — Effect of over- or undercompensation of copper resistance; 22-gauge
BSA cable, spacing 12,000 feet, 16-μs time constant.

Fig. 9— Cable route in field test of negative impedance boosting. The line was
32 miles long, mostly buried 22-gauge cable, and it had 16 NIB sections.

Touch-Tone® calling was used on one line, rotary dialing on the 
other. The severe distortion occurring when the rotary dialing pulses 
were produced by complete interruption of loop current was rem- 
edied by having the dial merely insert enough resistance to drop 
the current from 16 to 6 mA (the regulator going out of range). With 
the NIB thus left operative, dial pulse distortion became negligible. 

Tone ringing® was used on both lines, the signal being a 1,000 Hz 
wave interrupted at 10 Hz. This was applied with a level of about 1 
mW at the applique line circuit, under control of the ordinary ringing 
signal from the PBX. 

Supervision was conventional. The current regulator was so designed 
that when the path was broken by the switchhook the open-circuit 
voltage on the loop did not greatly exceed the ±95 volt figure. A relay
in the applique, responding to the switchhook (and dial pulses) trans- 
ferred the information to the PBX pair. 

The performance of the NIB lines was gratifying. People conversing
over them were favorably impressed by resemblance of the transmission
to that over a short loop, and by freedom from noise, hum,
crosstalk and distortion. Fig. 10 shows the insertion loss of one
32.4-mile line measured between 225-ohm resistive terminations. It also
shows a computed plot of this loss, using a program that takes
account of the individual dimensions of each section and NIB unit.

Fig. 10— Insertion loss of 32-mile field-test NIB line between 225-ohm
resistive terminations.

To help ensure stability in spite of the inherent restrictions on 
temperature compensation, the total copper resistance (6,000 ohms) 
was intentionally left undercompensated by about 100 ohms. As a 
result, the insertion loss had a low-frequency asymptote of roughly 
2 dB. Strip chart records of a 1 kHz test tone showed the transmis- 
sion varying over a typical day and night by about +0.5 dB. Neither 
line lost stability at any time during the entire test, which extended 
over four fall and winter months and encountered large and rapid 
changes of weather. 

Figure 11 shows the input impedance of one line, measured and
computed, for a 225-ohm resistive far-end termination. The irregulari- 
ties of these plots, resulting from nonuniformity of the sections, 
correspond to echo return losses no smaller than 12 dB, and exceeding 
17 dB over most of the band. 

Crosstalk loss between the two lines was roughly 88 dB at 1 kHz; 
there was little difference between near-end and far-end measurements. 

In planning the field test, hum was of course recognized as a pos- 
sible source of trouble. It was known that hum is generally introduced 
by magnetic induction from power lines, effectively generating equal
voltages in series with each conductor. Longitudinal hum currents, 
impelled by these voltages, could trouble the NIB transmission in 
two ways: by using up a significant part of the operating range of 
the NIB units, and by coupling into the metallic circuit as a result 
of unbalance between the two sides of the line. 

Experience and measurements afforded by the test were encourag- 
ing, but not extensive enough to be conclusive. In order to minimize 
hum currents, station grounds were avoided; the only path to ground 
was via capacitance distributed along the line. At the central office 
end, the longitudinal termination to ground was roughly matched to 
the longitudinal impedance of the line, to avoid possible accumulation 
of multiple reflections. With this arrangement line balance was found 
adequate to prevent more than a trivial hum level from ever being 
coupled into the telephones. 

Hum voltage to ground (largely 60 Hz) recorded at the station
end was found to vary from minute to minute as well as over a daily
cycle. The extreme range of these measurements was from 1.8 to 6.5
volts rms, the largest values occurring around 5 to 6 pm. Without
knowing the distribution of magnetic induction along the line, one
could not determine hum current from such measurements. However,
by varying the d-c loop current away from its usual 16 mA value
until current peaks “bumped” an edge of the NIB dynamic range
(putting audible 120 Hz pulses into the metallic circuit) one could
readily measure the maximum hum current, reached at some boosting
point along the line. Typical measurements of this sort gave values
around 5 mA peak-to-peak in each conductor, or 25 per cent of the
20 mA dynamic range; under worst conditions at least half the range
was undoubtedly filled. Although this amount of hum was found to
have no noticeable effect on telephone speech, larger hum currents
would probably be encountered at other locations.

Fig. 11— Input impedance of 32-mile NIB line with 225-ohm resistive far-end
termination.

Experience with lightning was also encouraging although far from 
comprehensive. The NIB units were left unprotected except by their 
own fairly low resistance at large forward currents, and by diodes to 
bypass reverse currents. No damage was done by thunderstorms, 
several of which did occur during the field run. Not until after these 
tests was it recognized that valid protection against large forward 
currents also could have been provided by merely giving each bypass- 
ing diode a zener potential of around 10 volts. Of course, this value is 
chosen to exceed the drop across the NIB at the “first bend” of its 
V-I plot. At large forward currents, the emitter and base circuit resis- 
tors of Fig. 1 combine to give a terminal resistance of about 48 ohms. 
With the terminal voltage zener-limited to 10 volts, the current 
through the NIB could not exceed 0.2 ampere, whereas simulated 
lightning tests have shown that an unprotected NIB is undamaged by 
surge currents as great as 5 amperes. Lightning is not expected to 
present a serious problem. 


VI. BAND BROADENING 


Shortly after conclusion of this field experiment continuing effort 
to improve the NIB circuit revealed that by adding to it a resistor 
and a capacitor, one could flatten and substantially broaden the 
resulting transmission band, indeed achieving virtually flat lossless 
transmission almost up to the frequency of quarter-wavelength NIB 
spacing. The band-broadened equivalent circuit, shown in Fig. 12, 
is simply that of the basic unit (Fig. 2) shunted by R, and C, in series. 

The effect of the addition can be seen more readily if one first 
writes the impedance of the basic unit: 


$$Z_A = R_A + jX_A = R_S - \frac{R_N}{1 + \omega^2 T_N^2} + j\omega \frac{T_N R_N}{1 + \omega^2 T_N^2} \quad (9)$$
Fig. 12 — Equivalent circuit of NIB unit with band broadening. 


When the shunt is applied, the real component R, (negative) of the 
resulting terminal impedance Zz is made larger than the real compo- 
nent R, (also negative) of Z, by what amounts to antiresonance 
between C, and the positive (inductive) imaginary component of Z,. 
Resistor R, keeps the shunt path from acquiring so low an impedance 
at any frequency as to bring instability to the “open-circuit-stable” 
basic unit. 

When the straightforward algebraic analysis used to derive expres- 
sion (7) for the basic unit is repeated for the band-broadened circuit, 
it shows that the asymptotic low-frequency image impedance (for 
G = 0 and $\Sigma R$ = 0) has been slightly modified. With the shunt
elements added, 

; ; RT, , Lb RP RIT, 
PEN ee Os 0) 

This expression reverts to (7) when 7, — 0 with R, > 0, or when 

R,— © with finite 7,. Computer results confirm the accuracy of (10). 

Computed transmission characteristics also support an initial esti- 
mate that the time constant 7, = R;,C, should be made roughly equal 
to T,,, and show that the revised circuit can be proportioned to sustain 
its compensation of copper resistance up to higher frequencies, while 
still letting its negative resistance fall off fast enough above the 
transmission band to preserve stability. 
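The action of the shunt branch can be illustrated numerically. The following is a minimal sketch, not the author's program. It assumes the basic unit is a positive resistance in series with a negative resistance that is bypassed by a negative capacitor, and that the added branch is a resistor and capacitor in series across the whole unit. The names `R_S`, `R_N`, `T_N`, `R_SH` and `C_SH` are hypothetical; the numerical values are those of the 12,000-foot example, with the shunt time constant set equal to the basic one.

```python
import math

R_S, R_N, T_N = 97.3, 490.5, 16e-6   # basic-unit example values from the text
R_SH = 2000.0                        # shunt resistor, ohms (hypothetical name)
C_SH = T_N / R_SH                    # shunt capacitor giving R_SH * C_SH = T_N

def z_basic(f):
    """Impedance of the basic unit at frequency f (Hz), assumed model."""
    w = 2 * math.pi * f
    return R_S - R_N / (1 + 1j * w * T_N)

def z_broadened(f):
    """Basic unit in parallel with the series R_SH, C_SH branch."""
    w = 2 * math.pi * f
    z_a = z_basic(f)
    z_sh = R_SH + 1 / (1j * w * C_SH)
    return z_a * z_sh / (z_a + z_sh)

for f in (300.0, 1000.0, 3000.0, 6000.0):
    print(f, round(z_basic(f).real, 1), round(z_broadened(f).real, 1))
```

In this model the real part of the shunted impedance stays closer to its low-frequency value of about -393 ohms as frequency rises, which is the sense in which the compensation of copper resistance is sustained to higher frequencies.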

The effect of band broadening upon the NIB transmission is shown
in Figs. 13 and 14 for the case of 22-gauge BSA cable with 12,000-foot
NIB spacing, used earlier as an example. Here both time constants
(that of the basic unit and that of the shunt) are made 16 μs, and
curves are shown for three values of the shunt resistor. When it is
2,000 Ω the attenuation (Fig. 13) has its widest flat region without
appreciable gain over any of the band. When it is infinite, the circuit
reverts to the original or basic NIB. At an intermediate value,
6,500 Ω, there is less band broadening, but the
phase velocity (Fig. 14) becomes remarkably constant from zero
frequency up to 6.7 kHz (at which $l = \lambda/4$). A similar change in
slope of the phase velocity plot, shifting from positive sign for the
basic NIB to negative for the band-broadened version, has consistently
been observed over a wide variety of gauges and booster
spacings.

Fig. 13 — Effect of band broadening upon attenuation constant of NIB line;
22-gauge BSA cable, 12,000-foot spacing.


VII. PULSE FORMING 


The foregoing combination of linear variation of phase with an
approximately parabolic variation of loss in dB, both as functions of
frequency, clearly offers interesting possibilities in baseband pulse
transmission. Under such a condition the line has the properties of a
Gaussian filter. If rectangular pulses of a suitable width T and baud
rate 1/T are applied to it, these pulses are shaped in transmission
into the raised cosine form. As received, they have the width T
at half their peak amplitude and 2T along the baseline; they are
almost free of tails. For ideal raised-cosine pulse forming, the line
or Gaussian filter should have a loss of 1 neper or 8.68 dB at the baud
rate. Hence, for the case of the 6,500-ohm shunt resistor in Figs. 13
and 14, a baud rate of 8 kHz (at which the loss is about 0.635 dB per
mile) could be sent over a line 8.68/0.635 = 13.7 miles long. Of course if
the line were shorter, or the baud rate slower, the pulses would still
be symmetrical and well formed, but would show flatness at their
peaks.

Figure 15 shows the output “eye-diagram” formed by a random
sequence of 8-level rectangular pulses at a 16.67 kilobaud rate, sent
over 10.2 miles of 22-gauge BSA cable, with 6,000-foot NIB spacing.
Here the information rate was 3 times the baud rate, or 50 kilobits
per second. The NIB parameters were $R_S$ = 97.3 ohms, $R_N$ = 293.9
ohms, both time constants equal to $6.1 \times 10^{-6}$ second, and a shunt
resistor of 2,000 ohms.
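The two pieces of arithmetic in this section can be reproduced as follows. This is only an illustrative check with hypothetical variable names: the usable line length follows from dividing one neper of loss by the per-mile loss at the baud rate, and the information rate follows from the number of bits carried by each 8-level pulse.

```python
import math

NEPER_DB = 20 / math.log(10)        # one neper expressed in dB (about 8.686)

# Longest line giving 1 neper of loss at the baud rate.
loss_db_per_mile = 0.635            # read from the text for an 8 kHz baud rate
max_miles = NEPER_DB / loss_db_per_mile

# Information rate of the 8-level pulse train.
levels = 8
baud = 16.67e3                      # pulses per second
bit_rate = baud * math.log2(levels) # 3 bits per pulse

print(f"max length ~ {max_miles:.1f} miles, rate ~ {bit_rate / 1e3:.0f} kb/s")
```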


VIII. BIDIRECTIONAL TRANSMISSION 


Because of low loss in a broad transmission band, and an image
impedance that can be well matched over that band, new possibilities
are opened of simultaneous bidirectional carrier transmission; for
example, by double-sideband amplitude modulation of the same carrier
frequency at each terminal of the line. Similar possibilities exist
for bidirectional baseband pulse transmission. Both of these schemes
have been successfully carried out in the laboratory over the same
10.2-mile 22-gauge line with 6,000-foot spacing that was used in
obtaining Fig. 15.

Fig. 14 — Effect of band broadening upon phase velocity of NIB line; 22-gauge
BSA cable, 12,000-foot spacing.

In either case, hybrid balance separates the incoming from the 
outgoing signal. As a result of the low transmission loss, the received 
signal, if it is a modulated carrier, is left sufficiently free of out- 
going carrier (whatever its phase) to be detected without appreciable 
distortion. Similarly, if the received signal is a pulse train, it is left 
sufficiently free of interference from the outgoing pulses to be cor- 
rectly decoded or regenerated. 

Figure 16 shows two eye diagrams, received simultaneously at the 
two ends of the 10.2-mile line while two random 8-level pulse trains 
were being sent in the respective directions. Some interference may 
be seen in the interpulse intervals, resulting from imperfection of the 
hybrid balance presented to the higher frequency components of the 
rectangular input pulses. For this photograph, the pulse rate was 
raised slightly (to 16.81 kilobauds), thereby roughly centering the 
interference in the intervals between eyes of the diagram. 





Fig. 15 — Eight-level pulses received over phase-linearized NIB line.

Fig. 16 — Bilateral pulse transmission over phase-linearized NIB line.


APPENDIX 


Zero-frequency Asymptotes of Midspacing 
Image Impedance and Phase Velocity of NIB Lines 


In this appendix derivations are given for expressions (7) and (8) 
of the text. The same method yields (10) when the NIB units include 
R, and C, as in Fig. 12. 


Terms 


$Z_N$ = Midspacing image impedance of NIB line.

$V_N$ = Phase velocity of NIB line.

Z, = Total impedance of two NIB units, one on each side of balanced 
line, serving a single section. 

$l$ = Length of NIB section (miles).

$Z_0$ = Characteristic impedance of nonloaded line.

$\gamma = \alpha + j\beta$ = Propagation constant of nonloaded line (per mile).

$Z_{OC}$ = Open-circuit impedance of nonloaded half section (length $l/2$).

$Z_{SC}$ = Short-circuit impedance of nonloaded half section (length $l/2$).

$T_N = R_N C_N$ = Time constant of basic NIB unit.

$lP = l(\alpha_N + j\beta_N)$ = Propagation constant of NIB line (per section).

$Z' = \lim_{\omega \to 0} Z_N$

$V' = \lim_{\omega \to 0} V_N$
Characteristic Impedance 


From well-known theory,‘ 
Of present interest is the special case in which G = 0 and $R_N - R_S$ =

Thus far, although F, S, and M are power series expansions, they are
included in their entirety; nothing has been approximated, and (28) 
is therefore exact. 
In passing to the zero-frequency limit we notice that when G = 0


$$\lim_{\omega \to 0} F = \lim_{\omega \to 0} S = \lim_{\omega \to 0} M = 1. \quad (29)$$


Accordingly, for the special case considered, 


AER nh 3 ED) gf ey | 
af mt at 96 0 
Again from well-known theory,*

From (30) the zero-frequency limit of $Z_N$ is finite, real and presumed
positive, while from (19) that of $Z_{OC}$ (for G = 0) is infinite, imaginary
and negative. Hence as $\omega \to 0$,
Computation of FM Distortion in Linear 
Networks for Bandlimited 
Periodic Signals 


By CLYDE L. RUTHROFF 


(Manuscript received December 14, 1967) 


Computations of the distortion generated in passing large-index, fre- 
quency-modulated signals through symmetrical single-pole and three-pole 
bandpass filters are presented. The computation is for a bandlimited 
periodic modulation signal; noise modulation is simulated by the use of
periodic noise samples in a Monte Carlo procedure.

The convergence of the Monte Carlo procedure is illustrated for the 
case of the single-pole filter and the results are in good agreement with 
measurements. 

Computations of envelope distortion are also presented. These data give 
the amplitude-to-phase conversion in the receiver containing the filter to 
within a constant factor, the constant being the AM/PM conversion 
coefficient of the limiter. 


I. INTRODUCTION 


In spite of the efforts of a large number of investigators who have 
studied the problem over three decades there is no way to compute 
the distortion caused by filters and other networks for arbitrary 
angle modulated signals of large index or large baseband bandwidths. 
However, by use of the Fourier method** introduced by Roder in 
1937, it is possible to compute the exact responses of networks to a 
frequency-modulated signal for bandlimited periodic modulation 
signals. 

In addition to deterministic signals of this class, noise modulation 
can also be simulated and the resulting network distortion computed 
by a Monte Carlo procedure. In an excellent paper, Medhurst and 
Roberts* have described the procedure and given some results for 
low index FM, pre-emphasized in accordance with CCIR standards.
Their computer program was written in Extended Mercury Autocode.
The same method, coded in ForTRAN 1, and extended to include the 
effects of amplitude as well as phase distortion is being used to study 
large index FM systems. 

The results presented are for single sine wave modulation and 
random noise modulation. 


II. ANALYSIS 


The modulating signals are restricted to those which are both 
bandlimited and periodic. This class includes many signals used for 
test purposes; the notable exception is the signal consisting of band- 
limited Gaussian noise. More will be said of noise modulation later. 

The analysis and computational procedure follows that of Med- 
hurst and Roberts in Ref. 4 and is outlined briefly here. Specifically, 
the signals are those which can be written as finite Fourier series. 


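$$\mu(t) = \sum_{n=1}^{N} (a_n \cos n\omega_a t + b_n \sin n\omega_a t)\ \text{radians}, \quad (1)$$


where:


$\omega_a = 2\pi f_a = 2\pi/T$,
$T$ is the period of $\mu(t)$,


$$a_n = \frac{2}{T}\int_{-T/2}^{T/2} \mu(t) \cos n\omega_a t \, dt,$$


$$b_n = \frac{2}{T}\int_{-T/2}^{T/2} \mu(t) \sin n\omega_a t \, dt.$$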

If $\mu(t)$ is the desired phase modulation, or $\mu'(t) = d\mu(t)/dt$ the
frequency modulation, the angle-modulated signal is


$$e = (2)^{1/2} \cos[\omega_c t + \mu(t)] \quad (2)$$


where $\omega_c$ is the carrier frequency in radians per second. The FM signal
of (2) has a line spectrum with lines at $\omega_c \pm M\omega_a$, M = 1, 2, 3, ....
The lines always occur at these frequencies, changing only in amplitude
and phase as functions of $a_n$, $b_n$. It is this feature which makes possible
a digital computer solution and, conversely, is the reason for restricting
the form of the modulating signal to that of $\mu(t)$ in (1). Beginning
with (1) and (2) the major steps in the analysis are:


(i) Derive the line spectrum of (2).

(ii) Modify the lines in amplitude and phase in accordance with
the response of the network being studied.

(iii) Derive the envelope and phase of the modified line spectrum,
that is, determine $E(t)$ and $\theta(t)$ where the output of the network
is written


$$e_0 = E(t) \cos[\omega_c t + \theta(t)] \quad (3)$$

(iv) Derive the line spectrum of $E(t)$, $\theta(t)$, and $d\theta/dt$.
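Steps (i) to (iii) can be sketched in a few lines of code. This is an illustrative reconstruction, not the program described in the paper. It works on the complex envelope exp[jμ(t)] sampled over one period, uses a direct discrete Fourier sum, and takes the network response as a function of the line index M (the line at the carrier plus M times the fundamental). The function names and the example single-pole response are assumptions made for the illustration.

```python
import cmath
import math

def dft(x, sign):
    """Direct O(K^2) Fourier sum; sign=-1 analyses, sign=+1 synthesises."""
    K = len(x)
    return [sum(x[k] * cmath.exp(sign * 2j * math.pi * m * k / K)
                for k in range(K))
            for m in range(K)]

def network_output(mu, response):
    """Envelope and phase of the network output for one period of mu (radians).

    mu       : K equally spaced samples of the phase modulation
    response : function mapping line index M to a complex network response
    """
    K = len(mu)
    # (i) line spectrum of the complex envelope exp(j*mu(t))
    lines = [c / K for c in dft([cmath.exp(1j * v) for v in mu], -1)]
    # (ii) weight each line; bins above K/2 stand for negative M
    shaped = [lines[m] * response(m if m <= K // 2 else m - K)
              for m in range(K)]
    # (iii) return to the time domain and take envelope and phase
    y = dft(shaped, +1)
    return [abs(v) for v in y], [cmath.phase(v) for v in y]

# Single sine wave modulation with a peak phase deviation of 2 radians.
K, beta = 64, 2.0
mu = [beta * math.sin(2 * math.pi * k / K) for k in range(K)]

E_flat, theta_flat = network_output(mu, lambda M: 1.0)
E_pole, theta_pole = network_output(mu, lambda M: 1 / (1 + 1j * M / 4))
```

With a flat response the output phase reproduces μ(t) and the envelope is constant, while the hypothetical single-pole response produces envelope variation. K must be large enough that lines beyond K/2 are negligible. Step (iv) would apply the same Fourier sum to the envelope and phase samples.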


III. RANDOM MODULATION 


An important measuring method in widespread use on FM systems 
is the noise loading test. The importance of this method arises from 
the fact that a band of thermal noise is a good approximation to a 
frequency division multiplex signal which consists of a number of 
voice channels. In this test a band of thermal noise in the frequency 
range 0-W Hz is the baseband signal. The noise is removed by band 
rejection filters in one or more narrow bands or slots ahead of the 
modulator. At the receiver the power density appearing in the slots 
is a measure of the intermodulation distortion in the system. The 
results are usually given in the form of a signal-to-distortion ratio, 
the signal being the power density at the slot frequency when the 
band rejection filter is removed, that is, when the signal is present. 

Computations of distortion can be made along these lines by fol- 
lowing a Monte Carlo procedure with a sequence of random noise 
samples generated from the periodic form of (1). A set of N sine 
waves of equal amplitudes and random phases distributed uniformly 
in the interval 0 to $2\pi$ constitutes the basic signal. Figure 1 is an
example of this random noise sample for N = 10 and Fig. 2 for 
N = 50. One or more amplitudes are set to zero to form the slots, 
and the power in the slots as a result of network distortion is com- 
puted as outlined in Section II. The process is repeated with a se- 
quence of random noise samples, each sample with a set of N inde- 
pendent random phases. The distortion is averaged for the final 
result. If N is large enough, if the number of sets is large enough, 
and if the network transfer function is well-behaved, then the results 
approach those obtained in a noise loading test. 
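A periodic random noise sample of this kind might be generated as in the sketch below. It is a minimal illustration with hypothetical names, not the original program. The per-tone amplitude of 1/N is an assumed normalisation, chosen because it gives the rms value $1/(2N)^{1/2}$ quoted in the captions of Figs. 1 and 2.

```python
import math
import random

def noise_sample(N, K, rng, slots=()):
    """One period (K samples) of N equal-amplitude tones with random phases.

    Tones whose index appears in `slots` are set to zero to form the slots.
    """
    phases = [rng.uniform(0.0, 2 * math.pi) for _ in range(N)]
    amp = 1.0 / N                       # assumed normalisation
    return [sum(amp * math.cos(2 * math.pi * n * k / K + phases[n - 1])
                for n in range(1, N + 1) if n not in slots)
            for k in range(K)]

rng = random.Random(1)                  # fixed seed for a repeatable sample
N, K = 10, 64                           # K must exceed 2N
sample = noise_sample(N, K, rng)
rms = math.sqrt(sum(v * v for v in sample) / K)
peak_to_rms = max(abs(v) for v in sample) / rms

slotted = noise_sample(N, K, rng, slots=(1,))
rms_slotted = math.sqrt(sum(v * v for v in slotted) / K)
```

In a Monte Carlo run, each new sample would draw a fresh set of N phases, the distortion power falling in the slot would be computed for each, and the results would be averaged.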

Rice⁵ has shown that such a noise representation has a normal amplitude
distribution as $N \to \infty$ and $\omega_a \to 0$. Bennett⁶ has computed the
amplitude distribution as a function of N. The conclusion is that with
respect to amplitude distribution the sets of random signals of the
form (1) approximate Gaussian noise. With respect to the spectrum
the situation is otherwise; the spectrum of noise is continuous whereas
the simulation, for finite N, has a line spectrum. This means that the
results computed with the simulated noise will approximate the results
for real noise only for network responses which are smooth enough.
An example of a function which is not smooth enough is a network
response of unity at the spectral lines and zero elsewhere. In spite of
this limitation it is not expected that smoothness will be a serious
problem for most cases of interest.

Fig. 1—A periodic random noise sample for N = 10. Peak amplitude/rms
amplitude = $|-0.525|/[1/(2N)^{1/2}]$ = 2.34.


3.1 Modulation Index 


The modulating signal $\mu(t)$ can be written as follows:


$$\mu(t) = \sum_{n=1}^{N} A_n \cos(n\omega_a t + \alpha_n)\ \text{radians}, \quad (4)$$

where,

$$A_n^2 = a_n^2 + b_n^2, \qquad \alpha_n = -\tan^{-1}\frac{b_n}{a_n}.$$

The baseband is

$$W = N\omega_a. \quad (5)$$

Using (4) to simulate noise in a phase modulation system, the amplitudes
$A_n$ are equal and the random phases $\alpha_n$ are uniformly distributed
from 0 to $2\pi$. If the rms phase deviation is $\phi$ radians,


$$A_n = \phi(2/N)^{1/2}\ \text{radians}. \quad (6)$$


For the FM application the amplitude terms of the frequency modulation
$\mu'(t)$ are made equal to simulate a flat band of noise, that is,
$n\omega_a A_n = A$, the peak frequency deviation per sine wave. The mean
square frequency deviation is


$$\sigma^2 = \sum_{n=1}^{N} \frac{(n\omega_a A_n)^2}{2} = \frac{A^2}{2} \times N.$$


Substituting for $\omega_a$ from (5) we get

$$A_n = (\sigma/W)[(2N)^{1/2}]/n. \quad (7)$$

Fig. 2—A periodic random noise sample for N = 50. Peak amplitude/rms
amplitude = $|0.29|/[1/(2N)^{1/2}]$ = 2.90.
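The two amplitude assignments can be checked numerically. The sketch below uses hypothetical function names. It builds the tone amplitudes for the PM and FM cases and confirms that they give back the intended rms phase deviation and rms frequency deviation, with the frequency deviation expressed in units of the baseband W so that tone n lies at n/N.

```python
import math

def pm_tone_amplitudes(phi, N):
    """Equal phase amplitudes giving an rms phase deviation of phi radians."""
    return [phi * math.sqrt(2.0 / N)] * N

def fm_tone_amplitudes(sigma_over_W, N):
    """Phase amplitudes giving a flat FM band with rms deviation sigma."""
    return [sigma_over_W * math.sqrt(2 * N) / n for n in range(1, N + 1)]

N = 10
A_pm = pm_tone_amplitudes(0.3, N)
A_fm = fm_tone_amplitudes(0.5, N)

# Each tone contributes half its squared peak value to the mean square.
phi_check = math.sqrt(sum(a * a / 2 for a in A_pm))
sigma_check = math.sqrt(sum((n / N * a) ** 2 / 2
                            for n, a in enumerate(A_fm, start=1)))
print(phi_check, sigma_check)
```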
The rms phase and frequency deviations can be related to the RF
bandwidth by Carson’s rule which, for noise modulation, is written

$$B = 2W(1 + 4\sigma/W), \quad (8)$$


where the peak frequency deviation is assumed to be $4\sigma$. Suppose
that the line spectrum of (2) contains $kN$ lines in addition to the
carrier, then the bandwidth of the computed spectrum is


$$B = kN\omega_a. \quad (9)$$

From (5), (8), and (9) we get the relation between k and σ

$$k = 2(1 + 4\sigma/W). \quad (10)$$


This equation is as accurate as Carson’s rule and is useful for estimating
k when σ/W is given. If k is chosen too small, significant spectral
components are omitted from the spectrum; the effect is to pass the
complete spectrum through an ideal filter of bandwidth $kN\omega_a$.

In a similar manner k and $\phi$ can be related for the phase modulation
case. The rms frequency deviation for the PM case is given by


$$\frac{\sigma^2}{W^2} = \frac{\phi^2}{3}\left(1 + \frac{3}{2N} + \frac{1}{2N^2}\right) \quad (11)$$


where N is the number of tones in the baseband. Substitution of (11)
into (10) gives the desired result.


3.2 Limitations on Modulation Index 


It has been shown (9), that the maximum RF spectrum bandwidth
is given by $B = kN\omega_a$. From (5) the baseband bandwidth is $W = N\omega_a$.
Assuming that only negligible energy falls outside B, then B is the RF
bandwidth and the parameter k is a bandwidth expansion factor since


$$k = B/W. \quad (12)$$


Now, k and the rms frequency deviation σ are related by (10). The
product kN is limited by the high speed storage capacity of the machine;
this implies a relationship between N and σ/W. Let $M \ge kN$ be the
maximum value of kN which can be accommodated in the machine.
Then,


$$\sigma/W \le \tfrac{1}{4}(M/2N - 1). \quad (13)$$


This expression is dependent upon Carson’s rule and has the same
unknown precision, but it serves to demonstrate the point that if
large σ/W is desired, N must be made small. In the work reported
here, M = 500 so that for N = 10, σ/W ≤ 6. Conversely for N = 100,
σ/W ≤ 0.375.
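The trade between the number of tones and the attainable deviation follows directly from (10) and (13), and a short calculation reproduces the two limits quoted above. The function names are illustrative only.

```python
def expansion_factor(sigma_over_W):
    """Bandwidth expansion factor k from Carson's rule, as in (10)."""
    return 2 * (1 + 4 * sigma_over_W)

def max_sigma_over_W(M, N):
    """Largest sigma/W that fits in M spectral lines for N tones, as in (13)."""
    return 0.25 * (M / (2 * N) - 1)

for N in (10, 100):
    limit = max_sigma_over_W(500, N)
    print(N, limit, expansion_factor(limit) * N)   # last column is k*N
```

For both values of N the product kN at the limit comes back to 500, the stated storage bound.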

Because Carson’s rule has an unknown precision it is necessary to 
determine to reasonable accuracy the relationship between k and o/W. 
With a perfect rectangular filter of bandwidth kNw, , signal-to-dis- 
tortion ratios have been computed for the case N = 10. In these com- 
putations, slots 1 and 10 were set to zero separately and the SDR 
computed for that slot. 

The results are shown in Fig. 3 as a function of ¢/W with the band- 
width expansion ratio k as a parameter. In all cases slot 1 has the lowest 
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Fig. 3—FM signal-to-distortion ratios for square filters containing kN+1 
spectral lines and with N = 10. 
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SDR. The levelling off for SDR near 124 dB is probably caused by the 
computer round-off error. The negative slopes are the result of the 
finite filter bandwidth of kNw, and the decreasing accuracy of the 
method of harmonic interpolation in approximating the spectrum. 
Increasing k improves the accuracy of the approximation. 

Values of σ/W obtained from Carson’s rule in the form given in
(10) are shown by the arrows in Fig. 3. Fig. 3 can be used to determine
the value of k required to compute the SDR for a given σ/W. In all
examples reported here, k and N have been chosen so that without a
filter an SDR = 100 dB was obtained for the values of σ/W used. The
data of Fig. 3 are averages of 20 noise samples.


IV. THE SINGLE POLE FILTER 


The single-pole filter is the simplest possible realizable bandpass 
filter and is important for two reasons. 


(i) It is widely used. For example, it is nearly optimum for use
in the IF section of a frequency feedback receiver.’

(ii) As simple as it is, no previous method is adequate for the com-
putation of FM distortion for high frequencies and large deviations.


4.1 Single Sine Wave Modulation 


A number of years ago Bodtmann*® made extensive measurements 
on a single-pole filter with both single sine wave and noise modula- 
tion.* Let us compare the measured and computed results. 

The transfer function of a narrow band single-pole filter is

$$Y = \frac{1}{1 + j\,\dfrac{f - f_c}{f_0}} \tag{14}$$

where:


$f_c$ is the center frequency and
$f_0$ is the half bandwidth, that is, the frequencies at which the response
is down 3 dB are $f_c \pm f_0$.
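
As a minimal sketch of the single-pole response described above, the following evaluates the narrow band form Y = 1/[1 + j(f − f_c)/f_0] and its loss in dB. The function names are hypothetical, and the symbol assignment (f_c for the center frequency, f_0 for the half bandwidth) is the reading assumed here.

```python
import math


def single_pole_response(f_hz: float, f_center_hz: float, f_half_bw_hz: float) -> complex:
    """Complex response 1 / (1 + j (f - f_c) / f_0) of a narrow band single-pole filter."""
    return 1.0 / complex(1.0, (f_hz - f_center_hz) / f_half_bw_hz)


def loss_db(f_hz: float, f_center_hz: float, f_half_bw_hz: float) -> float:
    """Loss in dB relative to midband (positive values mean attenuation)."""
    return -20.0 * math.log10(abs(single_pole_response(f_hz, f_center_hz, f_half_bw_hz)))


if __name__ == "__main__":
    f_c, f_0 = 70e6, 1.223e6  # center frequency and half bandwidth quoted for Bodtmann's filter
    for offset in (0.0, f_0, -f_0, 5.0 * f_0):
        print(f"offset {offset / 1e6:+7.3f} MHz: loss {loss_db(f_c + offset, f_c, f_0):6.3f} dB")
```

At offsets of ±f_0 the loss is 10 log 2, about 3.01 dB, which matches the definition of the half bandwidth.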


Bodtmann’s filter was centered near 70 MHz with a half bandwidth
of 1.223 MHz. The skirts fit the response of (14) to within ±0.1 dB
out to the 15 dB loss points. The measured and computed ratios of
signal-to-third harmonic distortion power are shown in Fig. 4. Notice
the peculiarity which occurs at a deviation of 1.2 MHz where the curves
for 360 kHz and 1 MHz modulation frequencies cross. Existing theories
do not predict this behavior, which is verified here by direct computation.


* It was Bodtmann’s results which led to the discovery of a simple error in
existing theories.9-11

Fig. 4 - Third harmonic distortion in single pole filter.


4.2 Results for Random Modulation 


Computations of SDR have been made for a single pole filter for 
the random modulation discussed in Section III. The results, for 
noise samples of 10 and 50 sine waves of equal amplitude and random 
phase, are shown in Fig. 5 with Bodtmann’s measured results. The 
computations followed the Monte Carlo procedure described pre- 
viously. The data in Fig. 5 for N = 50 is the average over two slots 
at each frequency for 50 noise samples. The pairs of slots are 4 and 
5, 17 and 19, and 49 and 50, corresponding to the slot frequencies 
84 kHz, 360 kHz and 1 MHz, respectively. Data for all the slots
were computed in the same computer run. In the computations for 
N = 10 one slot at a time was computed, each point being the average 
of 80 noise samples. 
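
The kind of periodic noise sample used in these computations can be sketched as follows. This is an illustration under stated assumptions, not the paper's program: the tones are taken at harmonics 1 through N of a common fundamental with unit amplitude, the phases are drawn uniformly from 0 to 2π, and the function names and the number of points per period are hypothetical.

```python
import math
import random


def periodic_noise_sample(n_tones: int, rng: random.Random, points_per_period: int = 2048) -> list:
    """One period of sum_{k=1..N} cos(k*x + phase_k): equal amplitudes, random phases."""
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_tones)]
    xs = [2.0 * math.pi * i / points_per_period for i in range(points_per_period)]
    return [sum(math.cos((k + 1) * x + phases[k]) for k in range(n_tones)) for x in xs]


def rms(sample: list) -> float:
    """Root-mean-square value over one period."""
    return math.sqrt(sum(v * v for v in sample) / len(sample))


def peak_to_rms(sample: list) -> float:
    """Ratio of the largest magnitude in the period to the rms value."""
    return max(abs(v) for v in sample) / rms(sample)


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is repeatable
    for n in (10, 50):
        ratios = [peak_to_rms(periodic_noise_sample(n, rng)) for _ in range(5)]
        print(n, [round(r, 2) for r in ratios])
```

The rms value of every such sample is sqrt(N/2) regardless of the phases, so samples differ only in their peak structure. The peak-to-rms ratio therefore gives one simple measure of how "peaky" an individual sample is.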

When the noise sample is simulated by 50 sine waves, the agree-
ment with the experimental data is good. The SDR’s for the case of
10 sine waves per noise sample are somewhat higher, reflecting the
fact that larger modulation peaks are to be found in the sample
with the larger number of sine waves.°®

Fig. 5 - Bodtmann’s measured results compared with noise samples.


4.3 Convergence of the Monte Carlo Process


The SDR’s of 80 individual noise samples for N = 10 are shown
in Fig. 6 in four sets of 20 each. The average SDR as a function of
the number of noise samples is shown in Fig. 7; the four sets of Fig.
6 are averaged in sequence. It is interesting to ask how close to the
80-sample average one would get if only 20 samples were used. As a
partial answer, the four sets of Fig. 6 were averaged separately and
the results are shown in Fig. 8. All four 20-sample averages fall
within 1 dB of the 80-sample average.

Similar data for slot 19 are presented for the case N = 50 in Figs.
9, 10, and 11. Slot 17 was also computed and the averages for both
slots are shown in Figs. 12 and 13. The results for slots 17 + 19 are
remarkably similar to those of 19 alone. The 10-sample averages
deviate from the 50-sample average by a maximum of 2.7 dB for
slot 19 and 2.3 dB for the sum of slots 17 + 19. Interestingly enough,
the 10-sample average for N = 10 deviates from the 80-sample
average by a maximum of 2.2 dB.

The behavior of the SDR of a single noise sample as a function
of σ/W is also of interest. Fig. 14 shows this behavior for each of the
first six noise samples of set 1, Fig. 6, compared with the 80-sample
average. The same behavior has been observed for other filters. It
is clear that almost any noise sample will predict the SDR behavior
as a function of σ/W, but the actual SDR computed for the single
noise sample depends on the peakiness of the sample.

Fig. 6 - FM SDR in a single pole filter. Ten sine waves in baseband; SDR
computed in slot 4; bandwidth expansion factor k = 10; σ = 0.2 MHz; ω₀/W =
1.223.

Fig. 7 - Fluctuations in SDR of single pole filter as a function of number of
sets of computations. Ten sine waves in baseband; SDR computed in slot 4;
bandwidth expansion factor k = 10; σ = 0.2 MHz; ω₀/W = 1.223.

Fig. 8 - Fluctuations in SDR of single pole filter as a function of number of
sets of computations. Ten sine waves in baseband; SDR computed in slot 4;
bandwidth expansion factor k = 10; σ = 0.2 MHz; ω₀/W = 1.223.

Fig. 9 - FM SDR in a single pole filter. 50 sine waves in baseband; SDR
computed in slot 19; bandwidth expansion factor k = 10; σ = 0.2 MHz;
ω₀/W = 1.223.

Fig. 10 - Fluctuations in SDR of single pole filter as a function of number
of sets of computations. 50 sine waves in baseband; SDR computed in slot 19;
bandwidth expansion factor k = 10; σ = 0.2 MHz; ω₀/W = 1.223.

Fig. 11 - Fluctuations in SDR of single pole filter as a function of number
of sets of computations. 50 sine waves in baseband; SDR computed in slot 19;
bandwidth expansion factor k = 10; σ = 0.2 MHz; ω₀/W = 1.223.

Fig. 12 - Fluctuations in SDR of single pole filter as a function of number
of sets of computations. 50 sine waves in baseband; SDR computed in slots
17 + 19; bandwidth expansion factor k = 10; σ = 0.2 MHz; ω₀/W = 1.223.

Fig. 13 - Fluctuations in SDR of single pole filter as a function of number
of sets of computations. 50 sine waves in baseband; SDR computed in slots
17 + 19; bandwidth expansion factor k = 10; σ = 0.2 MHz; ω₀/W = 1.223.

Fig. 14 - Behavior of the SDR in a single noise sample as a function of
σ/W. Single pole filter; slot 4; N = 10; sample set 1.
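
The two kinds of averaging discussed in this section, averaging in sequence and averaging sets separately, can be sketched as below. The helper names are hypothetical. The section does not say whether the SDR values are averaged in dB or as power ratios, so the sketch simply averages whatever numbers it is given.

```python
def running_average(values: list) -> list:
    """Average of the first m values, for m = 1 .. len(values) (averaging in sequence)."""
    averages, total = [], 0.0
    for count, value in enumerate(values, start=1):
        total += value
        averages.append(total / count)
    return averages


def block_averages(values: list, block_size: int) -> list:
    """Separate averages of consecutive, non-overlapping sets of block_size values."""
    return [
        sum(values[i:i + block_size]) / block_size
        for i in range(0, len(values) - block_size + 1, block_size)
    ]


def max_deviation(partial_averages: list, reference: float) -> float:
    """Largest absolute difference between any partial average and a reference average."""
    return max(abs(a - reference) for a in partial_averages)
```

Applied to a list of 80 per-sample SDR values, `block_averages(values, 20)` would give the four 20-sample averages and `max_deviation` would give their largest departure from the 80-sample average.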


V. THE THREE-POLE MAXIMALLY FLAT AMPLITUDE FILTER 


The maximally flat amplitude filter is used widely in frequency
modulation systems; it has the flattest possible amplitude response
near the midband frequency and is often used in conjunction with a
phase equalizer. The transfer function of a narrow band three-pole
bandpass filter is

$$Y = \frac{1}{1 + b_1\left(j\,\dfrac{f - f_c}{f_0}\right) + b_2\left(j\,\dfrac{f - f_c}{f_0}\right)^2 + \left(j\,\dfrac{f - f_c}{f_0}\right)^3} \tag{15}$$

where

$f_c$ is the midband frequency and

$f_0$ is the filter half bandwidth; that is, the frequencies at which the
response is down 3 dB are $f_c \pm f_0$,

$b_1$, $b_2$ are both equal to 2 for an MFA filter.
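
A short sketch of the three-pole response, assuming the denominator has the form 1 + b_1 s + b_2 s² + s³ with s = j(f − f_c)/f_0. The function names are hypothetical. For b_1 = b_2 = 2 the squared magnitude reduces to 1/(1 + x⁶) with x = (f − f_c)/f_0, the maximally flat characteristic.

```python
import math


def three_pole_response(f_hz: float, f_center_hz: float, f_half_bw_hz: float,
                        b1: float = 2.0, b2: float = 2.0) -> complex:
    """Response 1 / (1 + b1*s + b2*s**2 + s**3) with s = j (f - f_c) / f_0."""
    s = complex(0.0, (f_hz - f_center_hz) / f_half_bw_hz)
    return 1.0 / (1.0 + b1 * s + b2 * s ** 2 + s ** 3)


def three_pole_loss_db(f_hz: float, f_center_hz: float, f_half_bw_hz: float,
                       b1: float = 2.0, b2: float = 2.0) -> float:
    """Loss in dB relative to midband (positive values mean attenuation)."""
    return -20.0 * math.log10(abs(three_pole_response(f_hz, f_center_hz, f_half_bw_hz, b1, b2)))


if __name__ == "__main__":
    # A 238 MHz 3 dB bandwidth corresponds to a half bandwidth of 119 MHz.
    print(three_pole_loss_db(119e6, 0.0, 119e6))  # about 3.01 dB at the band edge
    print(three_pole_loss_db(256e6, 0.0, 119e6))  # loss 256 MHz from midband
```

With a half bandwidth of 119 MHz the MFA loss 256 MHz from midband comes out very close to 20 dB, which agrees with the figure of 20 dB quoted in the caption of Fig. 19.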
Fig. 15 - FM signal-to-distortion ratios in a three-pole MFA filter. W = 7
MHz; 3 dB filter bandwidth = 238 MHz; N = 10; k = 50; no carrier offset.


SDR computations for an unequalized filter are presented in Fig. 15
as a function of frequency deviation. The dashed lines are 12 dB
per octave slopes placed arbitrarily to coincide with the data at
σ/W = 2. The data points are 20-sample averages. The large cross
is the SDR in slot 10 of a three pole 0.1 dB ripple Chebyshev filter
with the same skirt selectivity as the MFA filter at a frequency 256
MHz from the carrier. The Chebyshev filter is clearly superior to the
MFA filter in this instance. The SDR is a function of baseband W as
shown in Fig. 16 for σ/W = 3.12 and slot 10. An arbitrary slope of
18 dB per octave is included. As in Fig. 15, the data points are
20-sample averages.

Fig. 16 - FM signal-to-distortion ratios in a three-pole MFA filter. σ/W =
3.12; 3 dB filter bandwidth = 238 MHz; N = 10; k = 30; slot 10; no carrier
offset.

Fig. 17 shows the effect of a carrier frequency offset with respect 
to the filter midband frequency. In the application for which this 
filter was chosen, the midband frequency change over the ambient 
temperature range —40°F to +140°F is about +6 MHz. 

Results for perfect phase equalization are shown in Fig. 18; arbi- 
trary slopes have been added. It is clear that nearly all of the dis- 
tortion in the unequalized filter is due to nonlinear phase. 


VI. AMPLITUDE TO PHASE CONVERSION 


In addition to the FM distortion in the filter output there is gen- 
erally some envelope distortion. Since all known limiters convert 
envelope modulation to phase modulation this source of distortion 
must be accounted for in system design. The envelope distortion is 
computed as described in Section II and it is necessary to relate 
it to the AM/PM conversion of the limiter. 

Fig. 17 - FM SDR in three-pole MFA filter as a function of carrier offset.
W = 7 MHz; 3 dB filter bandwidth = 238 MHz; N = 10; k = 50; σ/W = 3.12.

For good limiters the AM/PM conversion is small and can be
assumed linear, that is,


θ = Qm (16)


where


m is the index of amplitude modulation for the slot of interest,
θ is the phase shift in radians in the same slot caused by m, and
Q is the AM/PM conversion coefficient.


The normal signal in the slot of interest is a sine wave of amplitude
A. The signal-to-AM/PM distortion ratio is given by


SDR (AM) = 20 log A/θ
         = 20 log A/Qm
         = 20 log A/m − 20 log Q. (17)

The first term, 20 log A/m, can be computed for the network and
the AM/PM conversion coefficient can be included separately.
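
The separation of (17) into a network term and a limiter term can be illustrated with a small sketch. The function name and the numerical values in the example are hypothetical and are not taken from the paper.

```python
import math


def sdr_am_db(amplitude: float, am_index: float, conversion_q: float) -> float:
    """Signal-to-AM/PM distortion ratio of (17): 20 log(A/m) - 20 log(Q)."""
    network_term = 20.0 * math.log10(amplitude / am_index)  # computed for the network alone
    limiter_term = -20.0 * math.log10(conversion_q)          # contribution of the limiter
    return network_term + limiter_term


if __name__ == "__main__":
    # Hypothetical values: A = 1, m = 0.001, Q = 0.1.
    print(sdr_am_db(1.0, 0.001, 0.1))
```

For these illustrative values the network term is 60 dB and the limiter term is 20 dB. Because the limiter enters only as an additive term, the network computation does not have to be repeated when a different limiter is considered.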

The AM and FM SDR’s for transitional Butterworth-Thomson
filters?? are plotted in Fig. 19. For the Chebyshev filter the AM and
FM SDR are 72.3 and 66.3 dB, respectively. All filters were adjusted
for equal loss 256 MHz from the midband frequency. The trends are
as expected: as the filter goes from MFA to maximally flat envelope
delay (MFED) the FM distortion decreases and the AM/PM dis-
tortion increases. The effect of the limiter AM/PM conversion coef-
ficient can be included by adding −20 log Q to the curve marked AM.

Fig. 18 - FM SDR in a phase-equalized three-pole MFA filter. W = 7 MHz;
3 dB bandwidth = 238 MHz; N = 10; k = 50; slot 10.

Fig. 19 - FM and AM SDR in three-pole transitional Butterworth-Thomson
filters. W = 7 MHz; loss 256 MHz from midband = 20 dB; N = 10; k = 30;
no carrier offset; σ/W = 3.12; slot 10.

The frequency responses for the filters are given by (15); for the
0.1 dB ripple Chebyshev filter b_1 = 1.921, b_2 = 1.801. For the tran-
sitional Butterworth-Thomson filters the parameters are:


Filter No.   1-MFA   2       3       4       5       6-MFED   7
b_1          2.0     2.103   2.201   2.294   2.383   2.466    2.547
b_2          2.0     2.092   2.182   2.268   2.352   2.433    2.510


VII. DISCUSSION 


The Fourier method for the computation of FM distortion in linear 
networks has been described and some results presented for single 
sine wave modulation and for random noise modulation simulated 
by groups of harmonically related sine waves. The method is exact 
to an accuracy determined by the round-off error in the machine.
Although the computation is exact for any individual input signal,
the results for noise modulation are only approximate because the 
results depend upon averaging over a finite number of periodic noise 
samples. Much of the work described in this paper has been devoted 
to describing the behavior of the noise computations and in the 
determination of the maximum modulation index for which computa- 
tions can be made with suitable accuracy. 

In addition to demonstrating the nature of convergence of the noise 
averaging method, a detailed comparison of this method with the 
experimental results of W. F. Bodtmann provides an excellent demon- 
stration of the extent to which a noise sample consisting of as few as 
10 sine waves approximates a thermal noise signal. The noise simulation
with a 10 sine wave noise sample is sufficient for most applications and
accurate computations have been made for modulation indexes of
σ ≤ 6W where σ is the rms frequency deviation and W is the bandwidth
of the modulating signal.

It is notable that a single periodic noise sample is sufficient to 
determine the shape of the curve describing the signal-to-distortion 
ratio as a function of the deviation, the baseband bandwidth, or the 
filter parameters. This result, illustrated in Fig. 14, can be used to 
conserve computational time when optimizing the parameters of a 
system.


Linear-Real Codes and Coders*


By WILLIAM H. PIERCE 
(Manuscript received August 22, 1967) 


In linear-real coding, the transmitted signals are (possibly redundant)
linear combinations of the data signals. The linear combination of data
signals can have a block pattern, resulting in linear-real block coders, or
a stationary pattern, resulting in linear-real stationary (shift-register)
coders. Stationary coding is shown to be a limiting case of block coding.
Both methods appear to be practical for the control of burst and impulse
noise. However, stationary coding appears to have some advantages and
is the only one we study here. We propose shift register implementations
which promise the required precision and dispersion at less cost than
tuned RLC circuits.

Error properties of both block and stationary coders are similar, but
it is easier to learn concepts by analyzing the block coders. When the receiver
is able, by using some of the techniques we discuss, to estimate the noise
covariance matrix for each codeblock, the resulting noise power is less than
that for receivers not using the statistics for each codeblock.

Nonlinear memoryless filters, such as clippers, are especially effective
when used with linear-real coders. We propose a memoryless filter which
attenuates the input signal more severely when a second input to the filter
indicates the channel is having a noise burst. If the memoryless filter is
designed for the worst case noise, then performance will not degrade with
decreased noise when the nonlinearity is odd and monotonic.


I. INTRODUCTION 


Many communications channels, including telephone channels, con- 
tain noise which comes in short bursts, such as noise from impulses. 
Such noise is particularly deleterious when the channel is used for 
the transmission of digital data.


* Part of the research for this article was performed at Carnegie Institute of 
Technology under National Science Foundation grants GP-39 and GK-373. 
Some of the material contained in this paper is taken from the author’s con-
vention article,®

At least as early as 1958 it was discovered that it is sometimes
possible to reduce digital errors in such channels without reducing the 
noise power by using a scheme such as Fig. 1 shows. In some formu- 
lations?~? the transformation A consisted of a continuous all-pass filter 
whose Fourier transform magnitude was unity at all frequencies but 
whose phase characteristic varied with frequency; the inverse linear 
transformation was the continuous all-pass filter with the conjugate 
phase characteristic. The linear filter was called the smear operation, 
and its inverse the desmear operation. Later papers considered linear 
transformations to be real-number matrices operating upon the data 
in blocks.**° 

In all schemes to which Fig. 1 applies, a single impulse of noise 
into the inverse linear filter will be transformed into an output noise 
which is dispersed in time. With proper design, this dispersed noise will 
be small enough at all times to not produce errors at the output of 
the quantizer. 

Our purpose is to investigate coding schemes which fall in the 
general pattern of Figure 1 to gain conceptual insight and learn practi- 
cal design. Such study is useful because the practicality of the matrix 
version has never been studied, and the continuous all-pass filter was 
limited by cost and filter imprecision. The shift registers we might 
propose avoid the problems which hindered the application of contin- 
uous all-pass filters. 

We show that the real-number linearity of the transformations of 
Fig. 1 will permit the receiver to use any available information about 
noise correlation or position. All of the proposed means for using this 
information are simple in concept, and some are simple to implement. 


II. DESCRIPTION


Linear-real block coding is a form of coding in which A, an n by k 
matrix of real numbers, is used to produce an output vector b from an 
input vector r according to the equation 


b = Ar. (1) 


Fig. 1 - A general arrangement for placing linear filters A and $A^{-1}$ to reduce
digital errors. [Block diagram labels: digital source, linear transformation A,
quantizer, digital output, analog output.]

If n > k, then b will be redundant in the sense that not all of its com-
ponents are independent. The word “real” is used in order to emphasize
the fact that the arithmetic in equation (1), and all other equations in
this paper, is real number arithmetic. The use of real number arithmetic
distinguishes this work from generalized parity-check coders which are
linear in finite-field arithmetic.

Stationary (shift register) linear-real coding is a limiting case of
linear-real block coding, but is best described as being the convolution
summation given by

$$b_i = \sum_{j} h_{i-j}\, r_j \tag{2}$$

where $b_i$ is the $i$th signal transmitted, $r_j$ is the $j$th data number, and
where $h_q$ can naturally be called the unit pulse response of the en-
coding filter at time-step q.
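
The convolution summation of equation (2) can be sketched for the simplest case. The sketch assumes a causal pulse response of finite length and a finite data block, which is narrower than the two-sided, doubly infinite setting of the paper. The function name and the example numbers are hypothetical.

```python
def stationary_encode(h: list, r: list) -> list:
    """b_i = sum_j h[i - j] * r[j] for a causal, finite pulse response h and finite data r."""
    b = [0.0] * (len(h) + len(r) - 1)
    for j, r_j in enumerate(r):
        for q, h_q in enumerate(h):
            b[j + q] += h_q * r_j  # data number j contributes h_q to output i = j + q
    return b


if __name__ == "__main__":
    # Hypothetical two-tap pulse response and three data numbers.
    print(stationary_encode([1.0, 0.5], [1.0, -1.0, 1.0]))
```

Each data number is spread over as many transmitted signals as the pulse response has taps.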

The conclusions to be reached on practical applications are that 
moderate cost encoders and decoders of considerable use for burst and 
impulse noise channels can be built as soon as low-cost tapped digital 
delay lines are available. Magnetic domain-wall digital delay lines, 
for example, might well make these coders practical. 

There are two general ways in which noise is controlled by means 
of linear-real coding. We give the complete details and mathematics 
later. Briefly, the qualitative aspects are: 


The total noise power in the decoded signal is made less than that 
without coding. We discuss three distinct ways of doing this: 


(i) When linear-real block coding is used, and when the noise
covariance matrix is known (or can be adaptively deduced by the
receiver) then this knowledge can be used to reduce the noise power.
It can be correlation type knowledge, as accounts for the effectiveness
of Wiener filtering. If the noise process is a posteriori nonstationary,
then a receiver which estimates the noise correlation matrix for each
code block may effectively use the available information on the po-
sition of burst noises within the block. This is particularly effective in
burst noise channels having block coders using rectangular A matrices.

(ii) A stationary memoryless nonlinear filter (such as a clipper) can
be used to reduce the noise power before the inverse linear transforma-
tion is applied. Such a filter would of course reduce noise power in
the absence of an inverse filter when it immediately precedes the
quantizer, but it would not then reduce errors. When placed before
the inverse transformation, the stationary memoryless nonlinear filter
reduces both errors and noise power. We refer to equations for analyz-
ing design and performance of the memoryless nonlinear filter. A simu-
lation example in Section VI shows these devices to be surprisingly
effective.

(iii) A memoryless nonlinear filter can be used which has both the
noisy signal and an estimate of the instantaneous noise power for
inputs. The output is an optimized estimate of the signal given the
estimated instantaneous noise power. This filter always reduces noise
power, as does the filter in method (ii), and only reduces errors if
there is a filter such as the inverse linear transformation between it
and the quantizer. We describe several methods for estimating the in-
stantaneous noise power in Section V. One of these, which appears in
Fig. 6, uses the fact that practical PAM signals have more bandwidth
than the Nyquist bandwidth for their pulse interval.


The remaining noise power is distributed more evenly among all 
decoded signal components and (in the limit of infinite smearing) 
made Gaussian. This type of noise control is especially effective in 
quantized-signal burst and impulse noise channels which have a 
thermal noise which is small compared with the separation between 
quantization levels. In this case a burst noise with power which is 
small compared with the thermal noise would be unable to produce 
many errors if it were evenly dispersed, although it could when 
bunched up. Dispersal of the burst noise power is sometimes un- 
favorable, but if the noise power is reduced enough and the noise 
dispersed enough, then the effect is very favorable. The decoding op- 
eration also tends to make the decoded signal have a Gaussian first- 
order probability distribution, which reduces the probability of a 
large peak and thereby reduces errors for quantized signals. 

The design equations for the nonlinear memoryless filter (clipper)
to which we refer assume a known probability distribution on the 
noise, as does the simulation reported. In practice, the actual noise 
can be less noisy than that used for design purposes, and the resulting 
mean square error will not be larger than that with the design noise, 
provided the noise probability density is even and the nonlinearity 
has certain properties. We give precise details in Appendix D. 


III. BLOCK CODES AND THEIR NOISE COVARIANCE MATRIX 


In general, assuming r and c are independent zero-mean column
vector random variables, which represent the signal to be encoded and
the channel noise, respectively, and assuming r and c have nonsingular
covariance matrices Q and N, respectively, and assuming that f =
b + c is decoded by some linear operator T, where b is given in equation
(1), then a straightforward evaluation of the covariance matrix of u =
r − Tf will show that

$$M = E[uu^t] = (I_{k \times k} - TA)\,Q\,(I_{k \times k} - TA)^t + T N T^t \tag{3}$$

where $(\;)^t$ denotes the transpose of a matrix or column vector. This
formula can be used to compare the performance of encoder-decoder
pairs with good and bad choices for matrix A, and good and bad
choices of matrix T.

Table I shows three possible T matrices. The first was shown to be
the least mean square linear estimator in (9), and for Gaussian
signal and noise gives the conditional mean of the transmitted vector
given the received vector. The second is the first evaluated for infinite
signal power in all degrees of freedom (which implies $Q^{-1} = 0$)
and produces a decoded error uncorrelated with the signal. The third
does not require the use of the N matrix. All assume the columns of
A to be linearly independent.

Table II gives further insights into the behavior of the decoded error
by presenting a number of special cases of equation (3). The justifica-
tion of the equations of Table II is given in Appendix A. In one of
the special cases in Table II, namely when equation (7) applies, the
decoded noise energy is proportional to the arithmetic mean of the
received noise energy. In other cases, such as that of equation (12),
the eigenvalues of $A^t N^{-1} A$ play a crucial role in formulas for the mean
square decoded noise.
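
The special case in which the decoded noise depends only on the arithmetic mean of the received noise can be checked numerically. The sketch below assumes A is k^(-1/2) times the first k columns of a Sylvester-type Hadamard matrix, N is diagonal, and the unadaptive estimator is used, so that M = (k/n)² AᵗNA. The function names and the noise values are hypothetical.

```python
def sylvester_hadamard(n: int) -> list:
    """n x n Hadamard matrix of +1/-1 entries (Sylvester construction; n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def block_coder(n: int, k: int) -> list:
    """A = k**-0.5 times the first k columns of an n x n Hadamard matrix."""
    h = sylvester_hadamard(n)
    scale = k ** -0.5
    return [[scale * h[i][j] for j in range(k)] for i in range(n)]


def unadaptive_error_covariance(a: list, noise_variances: list) -> list:
    """M = (k/n)^2 A^t N A for N = diag(noise_variances)."""
    n, k = len(a), len(a[0])
    factor = (k / n) ** 2
    return [
        [factor * sum(a[m][i] * noise_variances[m] * a[m][j] for m in range(n))
         for j in range(k)]
        for i in range(k)
    ]


if __name__ == "__main__":
    n, k = 4, 2
    noise = [0.1, 0.1, 0.1, 4.0]  # hypothetical: one received sample hit by a burst
    m = unadaptive_error_covariance(block_coder(n, k), noise)
    arithmetic_mean = sum(noise) / n
    print([m[i][i] for i in range(k)], (k / n) * arithmetic_mean)
```

Every diagonal element of M equals (k/n) times the arithmetic mean of the noise variances, whichever received sample carries the burst. This is the even spreading of burst noise over the decoded components described in Section II.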

Equation (13) of Table II shows that the average of the eigen-
values of $A^t N^{-1} A$ appears in a formula for a lower bound for the mean


TABLE I - THREE DIFFERENT LINEAR OPERATORS FOR
DECODING f INTO r.

Name                                  Formula
Mean estimator                        $T = (Q^{-1} + A^t N^{-1} A)^{-1} A^t N^{-1}$
(Gives least mean square error)
Unattenuated estimator                $T = (A^t N^{-1} A)^{-1} A^t N^{-1}$
Unadaptive estimator                  $T = (A^t A)^{-1} A^t$
(The generalized inverse of A)


TABLE II - SOME SPECIAL CASES OF THE ERROR COVARIANCE
MATRIX OF EQUATION (3) AND THE RESULTING MEAN SQUARE ERROR

Unadaptive Estimator

$$M = (A^t A)^{-1} A^t N A\,[(A^t A)^{-1}]^t, \tag{4}$$

$$\text{m.s. error} = \frac{1}{k}\,\mathrm{tr}\,(A^t A)^{-1} A^t N A\,[(A^t A)^{-1}]^t. \tag{5}$$

When the columns of A are orthogonal and each of length $(n/k)^{1/2}$:

$$M = (k/n)^2 A^t N A. \tag{6}$$

When in addition $N = \mathrm{diag}(n_1, n_2, \ldots, n_n)$, and AM is the arithmetic mean of
these $n_i$'s, and A is $1/(k)^{1/2}$ times the first k columns of a Hadamard matrix (see
Appendix A for a definition):

$$\text{m.s. error} = M_{ii} = (k/n)\,\mathrm{AM}. \tag{7}$$

Unattenuated Estimator

$$M = (A^t N^{-1} A)^{-1}, \tag{8}$$

$$\text{m.s. error} = \frac{1}{k}\,\mathrm{tr}\,(A^t N^{-1} A)^{-1}. \tag{9}$$

Mean Estimator

$$M = (Q^{-1} + A^t N^{-1} A)^{-1}, \tag{10}$$

$$\text{m.s. error} = \frac{1}{k}\,\mathrm{tr}\,(Q^{-1} + A^t N^{-1} A)^{-1}. \tag{11}$$

Mean Estimator (Ω = 1) or Unattenuated Estimator (Ω = 0)

$$\text{m.s. error} = \frac{1}{k} \sum_{i=1}^{k} \frac{1}{\lambda_i(\Omega Q^{-1} + A^t N^{-1} A)} \tag{12}$$

where $\lambda_i(Z)$ denotes the $i$th unordered eigenvalue of Z. Special case of above when
Q = sI, s scalar:

$$\text{m.s. error} = \frac{1}{k} \sum_{i=1}^{k} \frac{1}{\Omega s^{-1} + \lambda_i(A^t N^{-1} A)} \ge \frac{1}{\Omega s^{-1} + \dfrac{1}{k} \sum_{i=1}^{k} \lambda_i(A^t N^{-1} A)}. \tag{13}$$

Special case of equation (12) when Q = sI, and A is square, orthogonal, and each
column has length $(n/k)^{1/2}$:

$$\text{m.s. error} = \frac{1}{k} \sum_{i=1}^{k} \frac{1}{\Omega s^{-1} + \dfrac{1}{\lambda_i(N)}}. \tag{14}$$

The following assumptions are referred to as equation (15):

Ω = 1: T is the mean estimator.
Ω = 0: T is the unattenuated estimator, and $A^t N^{-1} A$ is positive definite.

Q = sI, s scalar.

A is $1/(k)^{1/2}$ times the first k columns of an n × n Hadamard matrix.

$N = \mathrm{diag}(n_1, n_2, \ldots, n_n)$.

The $n_i$ variables are independent, identically distributed random variables such
that $E(1/n_i)$ exists, $1/n_i$ has finite variance σ², and the harmonic mean of the $n_i$
variables

$$\mathrm{HM} = [E(1/n_i)]^{-1} \tag{15}$$

is finite.
k is large enough for the weak law of large numbers to apply.
Assuming equation (15):

$$\frac{1}{\Omega s^{-1} + \dfrac{n}{k\,\mathrm{HM}}} \le \text{m.s. error} \tag{16}$$
m.s. error < 0.5 ——— 0.5 —— (17) 
2 n o2n - n ee 
°" + FHM a ee Nae 


provided that the first denominator is positive, where 7 is given by equation 35 
of Appendix I. 





square error; furthermore, that the mean square error equals this
lower bound only when all the eigenvalues are the same. Thus the
deviations of the eigenvalues of $A^t N^{-1} A$ determine the closeness of the
lower bound of equation (13), which Appendix A shows is sometimes
related to the harmonic mean of the eigenvalues of N, which appears
in equations (16) and (17).

A geometric illustration of the eigenvalues of $A^t N^{-1} A$ for rectangu-
lar A with orthonormal columns begins with the observation that the
eigenvectors of $N^{-1}$ form the semiaxes of an n-dimensional ellipsoid.
The projection of this ellipsoid by the transformation $A^t$ forms an-
other ellipsoid, which will be called the k-dimensional shadow of the
original n-dimensional ellipsoid.*

The semiaxes of the shadow ellipsoid have the lengths of the eigen-
values of $A^t N^{-1} A$. In order for the equation (13) bound to be close
to the actual value, the semiaxes of the shadow ellipsoid have to be
generally near their mean length; in other words, the shadow has to
be round. A sufficient condition for the shadow to be round is that
the ellipsoid is the shadow of a round ellipsoid, but this is not neces-
sary. For some of the possible spatial orientations, for example, a
football’s shadow is rounder than the football.


* An ordinary planar shadow of a three-dimensional object will be an orthogo-
nal projection only when the light rays are parallel, and are normal to the
plane of the shadow.


IV. THE LIMITING CASE OF STATIONARY (SHIFT REGISTER) CODING


The purpose of this section is to show that, in the limit, all linear-
real coding and decoding operations can become time stationary, so
that they can be implemented by shift registers with time-invariant
impulse responses. The limit is taken in the sense that the transmitted
digits are obtained as a single block code whose output is a column
vector with components from −n to n, where n approaches infinity.
There are two reasons why a study taking linear-real coding to the
limit of being time stationary can be advantageous or useful:


(i) Stationary encoders and decoders appear to be more economical
to implement than the block type of encoders and decoders.

(ii) The mathematical investigations to be made in the passage to
the limit will add insights to linear-real coding by showing that a
special case of it is Wiener filtering, and will add insights to Wiener
filtering by showing that a Wiener filter is related to the least mean
square estimator of matrix-encoded noise data vectors.

Toeplitz matrices, defined later, and Z-transforms (Ragazzini and
Franklin) 72 are our main mathematical techniques to reach these
ends.


4.1 Stationary Coders 

The transmitted signal $b_i$ is assumed to be obtained from the data
stream $r_j$ by the convolution summation of equation (2), which can
be put in matrix form by means of the doubly infinite vectors

$$b = \begin{bmatrix} \vdots \\ b_{-1} \\ b_0 \\ b_1 \\ \vdots \end{bmatrix}, \qquad r = \begin{bmatrix} \vdots \\ r_{-1} \\ r_0 \\ r_1 \\ \vdots \end{bmatrix}, \qquad \text{etc.,}$$

and the Toeplitz matrix (defined in Section 4.2)

$$A_{ij} = a_{i-j} = h_{i-j}$$

so that equation (2) can be expressed in matrix form by

$$b = Ar.$$

The problem of how to perform the infinite matrix multiplications,
either analytically or with hardware, will be shown to be solvable by
the use of Z-transforms.
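
The equivalence between the matrix form and the convolution summation can be seen on a finite block of the Toeplitz matrix. The sketch below is an illustration only: it truncates the doubly infinite matrix to a small square block, and the helper names and example numbers are hypothetical.

```python
def toeplitz_block(a: dict, size: int) -> list:
    """size x size block of the Toeplitz matrix A_ij = a_{i-j}.

    `a` maps an integer lag q to a_q; lags not present are treated as zero.
    """
    return [[a.get(i - j, 0.0) for j in range(size)] for i in range(size)]


def matvec(matrix: list, vector: list) -> list:
    """Ordinary real-number matrix-vector product."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, vector)) for row in matrix]


if __name__ == "__main__":
    a = {0: 1.0, 1: 0.5}          # hypothetical pulse response: h_0 = 1, h_1 = 0.5
    r = [1.0, -1.0, 1.0, 0.0]     # hypothetical data, padded with a trailing zero
    print(matvec(toeplitz_block(a, 4), r))
```

Every row of the block is the previous row shifted one place to the right, so multiplying by it applies the same pulse response at every time step.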


4.2 Infinite Toeplitz Matrices - 
An infinite matrix A, with elements A,;, 7,7 = 0, 41, +2, ---, 
will be called Toeplitz* if some Sequence ..., @1,%,,... exists 


* Hermitian matrices of the type of Equation (14) are called Toeplitz forms, 
and are described by Grenander and Szego.18 The Hermitian property is not 
assumed in this paper’s definition, since it is not needed for some of the results. 
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such that 
Aig = GG-1 (18) 


for all i, 7. Associated with this Toeplitz matrix will be the two- 
sided Z-transform 


az) = >> ag’. (19) 
The convergence properties of Toeplitz matrices could prove trouble- 
some in some cases, but in this paper most difficulties will be avoided by 
using only those matrices whose associated Z-transform, according to 
equations (18) and (19), has all its poles some finite distance from the 
circle | z | = 1, and which is absolutely convergent on |z| = 1. (If the 
matrix is to be inverted, it also must have its zeros some finite distance 
from |z| = 1.) 

Any poles outside $|z| = 1$ arise from $a_q$ sequences which are nonzero 
for $q < 0$. This should not cause alarm, as noncausality of unit pulse 
responses for decoders is not a serious practical obstacle, since actual 
noncausal unit pulse responses can be arbitrarily well approximated 
by accepting a decoding delay. These restrictions on the poles of the 
associated Z-transforms require that $a_q$ be bounded by a geometrically 
decreasing sequence as $q \to \pm\infty$. 


Section B.1 of Appendix B presents theorems which are useful in 
relating Toeplitz matrix operations to Z-transforms, and shows how 
least mean square matrix operators of the Toeplitz type can be related 
to Wiener-filter types of sampled data estimators. 


4.3 Error Analysis 


When A, Q, and N are Toeplitz and nonsingular, the expressions for 
the mean square error equivalent to the equations of Table 2 are 

\[ T_{\mathrm{MEAN}} = (Q^{-1} + A^t N^{-1} A)^{-1} A^t N^{-1} \]

or 

\[ t_{\mathrm{MEAN}}(z) = \frac{q(z)\,a(1/z)}{n(z) + a(1/z)\,a(z)\,q(z)} \]

gives 

\[ \text{m.s. error} = \text{diagonal component of } M_{\mathrm{MEAN}} = (Q^{-1} + A^t N^{-1} A)^{-1} \tag{20} \]

or 

\[ \text{m.s. error} = Z^{-1}\left[ \frac{n(z)\,q(z)}{n(z) + a(1/z)\,a(z)\,q(z)} \right]_{k=0} \tag{21} \]

where $Z^{-1}$ is the inverse Z-transform integral operator. When A is either 
finite or Toeplitz but nonsingular, $T_{\mathrm{UNATTENUATED}}$ and $T_{\mathrm{UNADAPTIVE}}$ 
give the same decoding matrix, namely $A^{-1}$, which will be called 
$T_{\mathrm{INVERSE}}$: 

\[ T_{\mathrm{INVERSE}} = A^{-1} \]

or for the Toeplitz case 

\[ t_{\mathrm{INVERSE}}(z) = \frac{1}{a(z)} \]

gives 

\[ \text{m.s. error} = \text{diagonal component of } A^{-1} N (A^{-1})^t \tag{22} \]

\[ \text{m.s. error} = Z^{-1}\left[ \frac{n(z)}{a(z)\,a(1/z)} \right]_{k=0}. \tag{23} \]



The above error can be evaluated by these three methods: 

(i) Truncate A and N and then compute an on-diagonal component 
of $(A^t N^{-1} A)^{-1}$ near the center of the matrix. 

(ii) Use Z-transforms to find $t_{\mathrm{UNATTENUATED}}(z)$. Invert the Z-trans- 
form by either 

(a) Using the inversion integral for Z-transforms, or 

(b) Using pole-zero expansions and a small table of Z-transforms. 

Method (ii-a) is the Z-transform analog of using Parseval's theorem to 
find mean square errors of stationary nonsampled systems. 
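
As a rough numerical illustration of methods (i) and (ii-a), the sketch below evaluates the inverse-decoder error of equation (22) both ways for white noise (N = I). The two-tap encoder $a_0 = 1$, $a_1 = 0.5$, the truncation size, the grid size, and the helper names are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def truncated_toeplitz(a, m):
    """m-by-m truncation of the Toeplitz matrix A_ij = a_{i-j} for a causal sequence a."""
    A = np.zeros((m, m))
    for q, a_q in enumerate(a):
        A += a_q * np.eye(m, k=-q)      # a_q sits on the q-th subdiagonal
    return A

def error_by_truncation(a, m=201):
    """Method (i): centre diagonal component of (A^t N^-1 A)^-1 with N = I."""
    A = truncated_toeplitz(a, m)
    M = np.linalg.inv(A.T @ A)
    return float(M[m // 2, m // 2])

def error_by_transform(a, points=4096):
    """Method (ii-a): average of n(z) / (a(z) a(1/z)) over |z| = 1, with n(z) = 1."""
    a_on_circle = np.fft.fft(a, points)  # samples of a(z) on the unit circle
    return float(np.mean(1.0 / np.abs(a_on_circle) ** 2))

a = [1.0, 0.5]                            # assumed example encoder
err_matrix = error_by_truncation(a)
err_transform = error_by_transform(a)
print(err_matrix, err_transform)          # both close to 4/3 for this example
```

For this example the two evaluations agree because the impulse response of $1/a(z)$ decays geometrically, so truncation effects are negligible near the center of the matrix.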


Lemma 1: When A is Toeplitz with columns orthogonal and of length 1, 
then 

(a) $A^t A = I$ 

(b) $a(1/z)\,a(z) = 1$. 

The proof is trivial. Also notice that (a) is equivalent to (b). 

Corollary 1: When $T = A^{-1}$ and A is orthogonal, 

\[ \text{m.s. error} = \text{m.s. noise}. \]


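For design purposes it is desirable to make the following definitions; 
both assume $T = A^{-1}$, which is assumed to exist. 

For Toeplitz A and N: 

\[ \text{noise power amplification} = \frac{[\text{diagonal component of } A^{-1} N (A^{-1})^t]\,[\text{diagonal component of } A^t A]}{[\text{on-diagonal component of } N]} \tag{24} \]

For Block Coders: 

\[ \text{noise power amplification} = \frac{\left[\frac{1}{k}\,\mathrm{tr}\, A^{-1} N (A^{-1})^t\right]\left[\frac{1}{k}\,\mathrm{tr}\, A^t A\right]}{\left[\frac{1}{k}\,\mathrm{tr}\, N\right]} \tag{25} \]
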


Physically, this corresponds to the actual amplification of noise in a 
channel which encodes with a matrix proportional to the A matrix, 
where the proportionality constant is selected to make the encoder 
give unity power amplification to a white signal, and where the decoder 
is $T_{\mathrm{INVERSE}}$. For the stationary coder and channel, the Z-transform 
version is: 

\[ \text{noise power amplification} = \frac{Z^{-1}\left[\dfrac{n(z)}{a(z)\,a(1/z)}\right]_{k=0} \; Z^{-1}\left[a(z)\,a(1/z)\right]_{k=0}}{Z^{-1}\left[n(z)\right]_{k=0}} \tag{26} \]








The block code version of the trace formula can also be used to show 
that if the impulse response of the stationary encoder is $\ldots, a_{-1}, a_0, a_1, \ldots$, 
and its inverse is $\ldots, b_{-1}, b_0, b_1, \ldots$, so that $a_q * b_q = \delta_{q,0}$, then 
for $N \propto I_\infty$ the noise power amplification can be evaluated from the 
impulse responses by: 

\[ \text{noise power amplification (for white noise)} = \left[\sum_q a_q^2\right]\left[\sum_q b_q^2\right] \tag{27} \]

The Z-transform version for $N \propto I_\infty$ is: 

\[ \text{noise power amplification (for white noise)} = Z^{-1}\left[\frac{1}{a(z)\,a(1/z)}\right]_{k=0} \; Z^{-1}\left[a(z)\,a(1/z)\right]_{k=0} \tag{28} \]
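
A minimal sketch of the impulse-response form of the white-noise amplification, equation (27), is given below. It assumes a causal encoder sequence with a causal, stable inverse; the function names and the two example sequences are hypothetical and chosen only for illustration.

```python
def inverse_impulse_response(a, length):
    """First `length` terms of the causal inverse b of a causal sequence a (a[0] != 0),
    obtained from the convolution identity a * b = delta."""
    b = []
    for q in range(length):
        acc = 1.0 if q == 0 else 0.0
        for j in range(1, min(q, len(a) - 1) + 1):
            acc -= a[j] * b[q - j]
        b.append(acc / a[0])
    return b

def noise_power_amplification(a, length=200):
    """Equation (27): (sum of a_q^2) times (sum of b_q^2), for white noise."""
    b = inverse_impulse_response(a, length)
    return sum(x * x for x in a) * sum(x * x for x in b)

npa_two_tap = noise_power_amplification([1.0, 0.5])   # assumed example encoder
npa_identity = noise_power_amplification([1.0])       # trivial coder, A = I
print(npa_two_tap, npa_identity)
```

The two-tap example gives an amplification above one, while the trivial coder gives exactly one; truncating the inverse to 200 terms is adequate here only because the inverse decays geometrically.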


It can be readily seen from equation (26) that: 





Lemma 2: When A is Toeplitz, the noise power amplification will be 
unity whenever 

\[ a(z)\,a(1/z) = \text{constant} \]

whether or not the noise is white, so long as it is Toeplitz. 

An equivalent statement is that when A and N are Toeplitz, a sufficient 
condition for the noise power amplification to be unity is that $A^t A = I_\infty$, 
which is equivalent to $a(z)\,a(1/z) = \text{constant}$. 

The above lemma will be seen to be especially significant after it 
is proved that unity noise power amplification is the least which can 
ever be obtained, and when it is shown that simple a(z) functions, 
namely all-pass functions, obey the conditions of the lemma. Notice 
that the noise power amplification definition was based upon a receiver 
which performed the inverse of the encoding operation, and not upon 
a receiver which made a least square estimate of the signal given the a 
posteriori noise statistics. Consequently, statements about least possible 
noise power amplification are not applicable to adaptive types of 
receivers such as those employing $T_{\mathrm{MEAN}}$. 

The following theorem is for block codes with n = k. 


Theorem 1: When square block coding is used and N is proportional to 
the identity, then the noise power amplification is always greater than or 
equal to one, and it is one only when A is proportional to an orthogonal 
matrix. 

Proof: What is required is a demonstration that: 

(1) 
\[ \frac{1}{k^2}\left(\mathrm{tr}\, A^{-1}(A^{-1})^t\right)\left(\mathrm{tr}\, A^t A\right) \ge 1 \tag{29} \]

and 

(2) Equality occurs if and only if A is proportional to an orthog- 
onal matrix. (30) 

These are established in Section 2 of Appendix B. 

The following corollary is the Toeplitz matrix limit version of the 
above. 

Corollary 2: When A and N are Toeplitz, and N is proportional to 
$I_\infty$, a necessary and sufficient condition for unity noise power amplifi- 
cation is that $A^t A = I_\infty$, which is equivalent to $a(1/z)\,a(z) = \text{constant}$. 
Otherwise the noise power amplification is greater than one. 
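
Theorem 1 can be spot-checked numerically. The sketch below, which assumes N = I and uses a randomly generated square matrix and an orthogonal matrix derived from it (both hypothetical test inputs), evaluates the left side of inequality (29).

```python
import numpy as np

def block_noise_power_amplification(A):
    """Equation (25) with N = I: (1/k) tr[A^-1 (A^-1)^t] times (1/k) tr[A^t A]."""
    k = A.shape[0]
    A_inv = np.linalg.inv(A)
    return float((np.trace(A_inv @ A_inv.T) / k) * (np.trace(A.T @ A) / k))

rng = np.random.default_rng(0)            # seed chosen only for repeatability
k = 8
A_random = rng.standard_normal((k, k))    # a generic nonsingular square coder
Q_orth, _ = np.linalg.qr(A_random)        # an orthogonal matrix built from it

npa_random = block_noise_power_amplification(A_random)
npa_orthogonal = block_noise_power_amplification(3.0 * Q_orth)  # proportional to orthogonal
print(npa_random, npa_orthogonal)
```

A single random draw is of course not a proof; it only shows the inequality holding strictly for a generic matrix and with equality for a scaled orthogonal one.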


When stationary (shift-register) linear-real decoding is used, then 
the decoding filter passes the noise through a Z-transform transfer 
function. When the noise is statistically stationary, the expected 
value of the mean square of the output noise is stationary, and de- 
pends only upon the amplitude of the transfer function averaged over 
the values of z. However, for burst noise the variance of the mean 
square of the decoded noise does depend upon the phase of the trans- 
fer function. For burst-noise or impulse-noise channels, this variance 
is minimized if the impulse response from the noise to the analog 
output of the decoder consists of many small terms instead of a few 
big ones. 

For quantized signals it is important to minimize the variance of 
noise power because fluctuations above the mean of the variance in- 
crease the error rate far more than fluctuations below the mean of 
the variance decrease it. In order to make the variance of the noise 
power small, the impulse response from noise to analog output must 
be near its peak for many times longer than the periods of fluctua- 
tion in the noise process. 

Because trace and expected value operators commute, the expected 
value of the output mean square error can be found by substituting 
E(N) where N appears, provided the noise process is stationary. This 
cannot be done for error probabilities after the quantizer, however. 


4.4 All-Pass Z-Transforms 

A Z-transform a(z) is defined to be all-pass if |a(z)| = constant 
for |z| = 1. These are the Z-transform version of two-sided Laplace 
(or Fourier) transformed all-pass functions. Figure 2 shows some im- 
portant properties of all-pass Z-transforms, including the fact that 
a(z)a(1/z) = constant is an alternative definition of an all-pass 
Z-transform. The proofs of relationships in the figure not proved 
previously are straightforward. The practical implications of these 
relationships are that all stationary (shift-register) linear-real coders 
should have Z-transforms which are all-pass, in order not to in- 
crease the noise power amplification. 


[Figure 2: chart of equivalent statements about all-pass Z-transforms. 
Recoverable entries: a(z) is all-pass (so by definition |a(z)| = constant 
for |z| = 1); a(z)a(1/z) = constant for all z; noise power amplification; 
if a(z) has a pole at $z_0$, then it must have a zero at $1/z_0$; if a(z) has 
a zero at $y$, then it must have a pole at $1/y$.] 

Fig. 2 - Some important relationships for all-pass Z networks. 

V. A. Kisel’ has made an excellent short study of all-pass Z-trans- 
forms, with a view toward using them as phase-correcting networks.** 
He has shown that networks whose Z-transform transfer functions 
are of the form 

\[ a(z) = \frac{1 + B_1 z + B_2 z^2 + B_3 z^3}{B_3 + B_2 z + B_1 z^2 + z^3} \]

are all-pass, and that Fig. 3 synthesizes such functions. Additional 
modifications are added to this basic structure and implementations 
are proposed in the next section. 
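
The all-pass property of this kind of transfer function can be checked on the unit circle. The sketch below assumes the third-order form in which the denominator carries the numerator coefficients in reversed order, with real gains; the gain values and the function name are arbitrary illustrative choices.

```python
import cmath
import math

def kisel_transfer(z, B1, B2, B3):
    """a(z) = (1 + B1 z + B2 z^2 + B3 z^3) / (B3 + B2 z + B1 z^2 + z^3), real B's."""
    numerator = 1 + B1 * z + B2 * z**2 + B3 * z**3
    denominator = B3 + B2 * z + B1 * z**2 + z**3
    return numerator / denominator

B1, B2, B3 = 0.4, -0.3, 0.2                # arbitrary illustrative gains
magnitudes = []
for n in range(64):
    z = cmath.exp(2j * math.pi * n / 64)   # points on |z| = 1
    magnitudes.append(abs(kisel_transfer(z, B1, B2, B3)))

spread = max(magnitudes) - min(magnitudes)
print(min(magnitudes), max(magnitudes))
```

The magnitude is constant because, for real coefficients, the denominator equals $z^3$ times the numerator evaluated at $1/z$, and the two therefore have equal modulus on $|z| = 1$.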


V. IMPLEMENTATION STUDIES 


The decoder for block coding with adaptive mean decoding appears 
to require a large modern digital computer, and even then it could 
probably only operate “on line” with a slow channel and a block 
size not much over one hundred. Further research may lead to A 
matrices for which $(Q^{-1} + A^t N^{-1} A)$ can be easily inverted for realistic 
Q and N, or to quicker inversion procedures, but with the present 
techniques, block coding with adaptive mean decoding appears to be 
decidedly less practical than other methods of error control. 

The decoder for unadaptive block decoding appears to be generally 
feasible if certain simplifying techniques are used. The most important 
of these is the use of an A matrix which is a permutation matrix* times 
diag($A_0, A_0, \ldots, A_0$), where $A_0$ is itself a matrix. $A_0$ 
must be large enough to give adequate smear, whereas A must be 
large enough to make error burst lengths considerably shorter than 
the length of a code word. The hardware simplification achieved is 
that the inverse of the small $A_0$ can be repeatedly applied in time by 
the same hardware so as to invert the larger A. The practicality of 
block coding appears to be slightly overshadowed by stationary 
(shift register) coding, which offers somewhat simpler circuits and 
freedom from the problem of block synchronization. 

Fig. 3 - A shift register (real-number arithmetic) whose Z-transform transfer 
function is all-pass. (After V. A. Kisel’, with modifications and a correction.) 

Stationary (shift register) coding appears to be the most practical 
form of linear-real coding. In effect, such coding is a smear-desmear 
type of signal processing whenever the encoding and decoding filters 
are inverses of each other and of the all-pass type. The fundamental 
reason for the practicality of shift register all-pass filters is that 
accurately tuned shift registers can be relatively inexpensively 
synthesized, even when the dispersion times are several seconds. This 
is partly so because the “absolute” tuning of a shift register is deter- 
mined by the clock pulses and not the precision of the components 
used in making the register, and partly because the “relative” tun- 
ing in a shift register is controlled by gains which in practice can be 
resistor values. As will be seen, analog shift registers can be imple- 
mented digitally, in which case complexity grows only as the loga- 
rithm of accuracy. In RLC filter synthesis, in contrast, cost grows 
rapidly with accuracy. 

Figure 4 is a block diagram for coding of the basic stationary (shift 
register) type. The decoder, because it must handle the analog signals 
from the channel instead of the digital input signals, is selected to 
have the impulse response simplest to implement, namely an all-pass 
causal 1/a(z) obtained by a shift register made from a tapped delay 
line with a relatively moderate number of taps. The encoder is con- 
sequently left with approximating the noncausal a(z), which it does 
with a delay by means of a tapped delay line. 

The decoding shift register of Fig. 4 can be implemented by the 
arrangement of Fig. 5, which is a particular synthesis of the all-pass 
shift register shown in Fig. 3. In Fig. 5 all the digital-to-analog con- 
version is done by resistor summing networks. This is relatively in- 
expensive, although it does require that the flip-flop registers be de- 
signed for relatively precise voltage levels on the “on” and “off” states. 


* A permutation matrix is a matrix with a single one in each column and 
each row; it is always nonsingular. 

[Figure 4: block diagram. Recoverable block labels: summer; tapped delay 
line (mean square approximation); channel; sampler; sampling instant 
determiner; digital filter register (see Fig. 5 for details); quantizer.] 





Fig. 4— One possible general arrangement for unadaptive stationary (shift- 
register) linear-real encoders and decoders. For multilevel signals, a Gray encoder 
can be used before the analog summer, and the quantizer would incorporate a 
Gray decoder. 


Notice that in Fig. 5 there is only one analog-to-digital converter, 
because the analog feedback signal is added to the input signal before 
the conversion which is necessary in order to place the signals in 
the digital delay line. 

The cost of the encoding and decoding shift registers will be roughly 
proportional to the amount of smear that they introduce. The amount 
of smear necessary for given performance depends upon the noise 
power. It follows that a considerable economic saving can be obtained 
at given performance if circuits, inexpensive compared to the decoder, 
can be found to reduce the noise during bursts. 

A new circuit with this purpose for PAM systems is shown in 
Fig. 6. The operation of the circuit requires that the interval between 
signal pulses be longer than the Nyquist interval for the bandwidth 
of the pulse shape. A way to find part of the noise component is to 
sample at the sampling instants, reconstruct the waveform which 
would be transmitted if these sample values were the data-signal 
values, and then subtract this signal from the actual received signal. 
(For proof of this statement, see Appendix C.) An estimate of the 
instantaneous noise power can be made directly from those noise 
components which can be found. These components, for example, can 
be used to deduce the presence or absence of a noise burst. The circuit 
in Fig. 6 can obtain some noise components,* provided that the taps 
on the delay line represent the PAM pulse value at t = nT/2, n odd. 
The output noise estimate (specifically A(nT/2), n odd, in the lan- 
guage of Appendix C and the footnote) is then squared to 
produce the sample variance of the noise; then the sample variance 
function is put through a smoothing filter, as shown in Fig. 6. The 
optimization of this filter is complicated by the absence of an ap- 
propriate error criterion, but Wiener filtering principles could be used 
to optimize a mean square criterion. The problem formulation would 
specify that the sample variance is the true ensemble variance con- 
taminated by small sample-size noise, and that the cross-correlation 
between the halfway sample process and the sample process could be 
found from the autocorrelation function of the channel noise. 

Finally, a two-input nonlinear memoryless filter is used, also shown 
in Fig. 6. It is reasonable to optimize this filter using a mean square 
criterion because in the limit of infinite smearing only the power of 
the noise will be significant because of the smearing and Gaussianizing 
effects of the decoding shift register. Some improvement may be pos- 
sible by using other criteria, but the details appear to be very dif- 
ficult and are unsolved. 

* Specifically, Fig. 6 obtains the sample values of A(t) of Appendix C at 
t = nT/2, n integer. Notice that by construction, A(nT/2) = 0 for n even. By 
the sampling theorem, just the samples of A(t) will be sufficient to reconstruct 
A(t) provided that C(ω) is zero for |ω| > 2π/T. 

[Figure 5: circuit diagram. Recoverable labels: resistor summing network; 
register; analog; parallel.] 

Fig. 5 - A possible arrangement for implementing the decoding shift register. 

Fig. 6 - A stationary (shift register) coder with an adaptive decoder for PAM 
channels with white burst noise and pulse rates less than the Nyquist rate. 

The general scheme of Fig. 6 appears to be the most economical 
form of linear-real coding when the channel is used for PAM at less 
than the Nyquist rate. Telephone lines are used at less than the Ny- 
quist rate because they are used with signals with nonsharp-cutoff 
frequency characteristics. Radio links can obtain information on non- 
tuned burst noise, such as static, by listening on adjacent frequencies, 
and could therefore provide the smoothed estimate of instantaneous 
noise power, needed as an input to the two-input memoryless filter, 
by other means. Instantaneous carrier-to-noise ratios could be used 
for carrier systems, for example. 

It is also possible to use a different principle of instantaneous noise 
power estimation which does not require a PAM channel used below 
the Nyquist rate. The other principle uses the quantized structure of 
the data stream. It is implemented by a decoder with a “pilot” decoder 
which decodes, followed by an operator which squares the difference 
between the signal and the nearest quantization level, which is then 
smoothed and put into a two-input memoryless filter like that of Fig. 
6, following which is the regular decoding shift register and quantizer. 
This scheme is probably less practical than Figs. 4 and 6, but it does 
give conceptual insights into some of the signal properties which can 
be used in decoding, especially for burst channels. 


VI. COMMENTS AND SIMULATION RESULTS 

Any sample of the decoded noise is a weighted sum of the random 
channel noises at many other sample instants. When the number of 
terms in this sum approaches infinity and the relative size of the larg- 
est term in the sum approaches zero, the central limit theorem applies. 
It will probably be true that practical designs will not have the con- 
ditions of the central limit theorem fulfilled to the extent that very 
small digital error probabilities can be computed by using integrals 
of the tails of the gaussian distribution. 

Nevertheless, the fact that the decoded noise at any instant is a 
sum of the random channel noises at many instants will tend to make 
the decoded noise have some of the characteristics of a gaussian dis- 
tribution. One characteristic that the decoded noise will have is the 
small probability that the decoded noise is larger than three or four 
standard deviations. This effect of the decoding filter (or matrix) 
will be called the gaussianizing property. 
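
The gaussianizing property can be illustrated with a small Monte Carlo sketch. It assumes an orthonormal 16-by-16 coder built from a Sylvester-type Hadamard matrix (an assumption; the paper's own construction follows Golomb) and burst-like noise with the two variances used in the simulation of this section. The seed, the number of words, and the helper names are illustrative.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis; zero for a Gaussian distribution."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return float(np.mean(centered**4) / np.mean(centered**2) ** 2 - 3.0)

def sylvester_hadamard(n):
    """n-by-n Hadamard matrix by Sylvester doubling (n must be a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

rng = np.random.default_rng(1)
n, words = 16, 20000
# Burst-like noise: variance 0.3 with probability 0.7, variance 8.3 otherwise.
variances = np.where(rng.random((words, n)) < 0.7, 0.3, 8.3)
channel_noise = rng.standard_normal((words, n)) * np.sqrt(variances)

A = sylvester_hadamard(n) / np.sqrt(n)      # orthonormal square coder
decoded_noise = channel_noise @ A           # each row is A^t applied to one noise word

kurt_channel = excess_kurtosis(channel_noise.ravel())
kurt_decoded = excess_kurtosis(decoded_noise.ravel())
print(kurt_channel, kurt_decoded)
```

The decoded noise keeps the same total power but has a much smaller excess kurtosis than the channel noise, which is the sense in which large peaks become less likely.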

The use of nonlinear filters in conjunction with linear-real coders 
is extremely effective, since such filters can considerably reduce both 
the noise power and the probability that the noise has a large peak. 
By reducing the probability that the noise has a large peak, the 
desirable gaussian distribution of the decoded noise occurs with 
smaller matrices, smaller shift-registers, or simpler all-pass filters. 
In the limit when the decoded noise is actually gaussian, the noise 
power is the only significant statistic; the higher-order moments of 
the noise become insignificant due to the gaussian-distributing prop- 
erty of the decoder. It is therefore quite appropriate to design the 
nonlinear filter using a mean square error criterion, as is done in 
Section VII of Reference 9. 

Linear-real coding has features which could greatly improve error 
detection in channels with burst noise. When erasure zones are used 
to detect errors, the gaussian-distributing property of the decoder 
greatly increases the ratio of the probability in the erasure zone to 
the probability beyond the erasure zone. In addition, the noise spread- 
ing gives more opportunities for a signal to land in an erasure zone in 
the presence of impulses or bursts, because of randomness of the 
decoded noise, and, with suitable designs, because of deterministic 
reasons. 

If the communications channel is, in order, digital processor to 
analog transmitter to analog receiver to digital processor, then linear- 
real block coding permits the energy per transmitted data digit to 
be altered by reprogramming the digital processors, instead of physi- 
cally retuning bandwidths of analog equipment. Although this option 
does not in itself affect error control, it perhaps could greatly simplify 
the implementation of adaptive communications systems in which the 
signal energy per digit is adjusted to be appropriate for the transmis- 
sion conditions, message importance, or message load. 

A digital computer simulation was run of an additive-noise channel 
with a linear-real block-code encoder at the input, and several types 
of decoders at the output. Table III shows the results of the simula- 
tion. The listed results are averages. The A matrix is the Hadamard 
matrix which is generated recursively according to the procedure de- 
scribed by Golomb and his colleagues (p. 55, first paragraph in proof 
of Theorem 4.5).1° The N matrix had zeros in all off-diagonal com- 
ponents, and independent random variables on the diagonals, which 
were 0.3 with probability 0.7 and 8.3 with probability 0.3. In ac- 
cordance with Theorem 4 in Appendix p, these can be worst-case 
values which then give the worst-case decoded mean square error. 

TABLE III - SIMULATED PERFORMANCE OF LINEAR-REAL CODERS 

                                   M.S. ERROR IN DECODED COMPONENTS 

                                   When receiver    When receiver uses 
Estimator                          uses perfect     N = diag(n_1', ..., n_n'), 
                                   N matrix         n_i' = max(0.3, f_i^2 - 1) 

Mean estimator                     0.516            0.711 
Unattenuated estimator             2.821            2.821 
Unadaptive estimator                                2.821 
Clip estimator, parameters 
  (1.2, 0.9, 4.0)                                   1.805 
  (1.0, 0.75, 3.0)                                  1.152 
  (0.8, 0.6, 2.0)                                   0.771 
  (0.6, 0.6, 1.5)                                   0.677 
  (0.5, 0.5, 1.3)                                   0.645 
  (0.4, 0.5, 1.0)                                   0.649 

COMMENTS 

Mean estimator: The lower bound of equation (18) is somewhat loose; it 
gives 0.297. 

Unattenuated estimator: Equation (12) has correctly predicted that the 
error would be the same as that of the unadaptive estimator because A is 
square. 

Unadaptive estimator: Equation (7) averaged over the possible N matrices 
gives m.s. error of 2.70. The randomness of the N matrix accounts for 
the difference. 

Channel: Additive noise channel sending +1 and -1 binary numbers and block 
encoding with an A which is $k^{-1/2}$ times the first k columns of an n by n 
Hadamard matrix. n = 16. k = 16. 

Number of words in simulation: 10. Noise type: Zero-mean white Gaussian noise 
has variance 0.3 with probability 0.7 and variance 8.3 with probability 0.3. 

Once the N matrix was generated, the channel noises were generated 
randomly from a Gaussian distribution having the given N 
for a covariance matrix. The clip estimator used a decoder which 
first put each received component through a memoryless nonlinearity, 
and then decoded the resulting components with the unadaptive esti- 
mator. The parameters (x, y, z) indicate that the nonlinearity is a 
continuous odd function having slope 1 for inputs of magnitude less 
than x, and slope y for inputs of magnitude between x and z, and 
slope 0 for inputs of magnitude exceeding z. These parameters can be 
chosen to approximate the least mean square memoryless nonlinear 
filter referred to earlier, or they can be found by a trial-and-error 
procedure with either analysis or simulations to evaluate the resulting 
error. 
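
The clip nonlinearity described above can be written directly from its definition. The following is a minimal sketch; the function name is hypothetical, and the parameter set is one of those listed in Table III.

```python
def clip_nonlinearity(u, x, y, z):
    """Continuous odd function: slope 1 for |u| < x, slope y for x <= |u| <= z,
    and slope 0 for |u| > z."""
    magnitude = abs(u)
    if magnitude < x:
        out = magnitude
    elif magnitude <= z:
        out = x + y * (magnitude - x)
    else:
        out = x + y * (z - x)
    return out if u >= 0 else -out

params = (1.0, 0.75, 3.0)          # one of the parameter sets listed in Table III
samples = [-5.0, -2.0, 0.5, 2.0, 5.0]
clipped = [clip_nonlinearity(u, *params) for u in samples]
print(clipped)   # [-2.5, -1.75, 0.5, 1.75, 2.5]
```

In the simulated decoder each received component would pass through this function before the unadaptive estimator is applied.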

The following two conclusions can be drawn from the simulation, 
but it would not be appropriate to generalize them to cases of non- 
square A matrices: 


(i) For intermittent additive impulse noise of the type simulated, 
the simple clip estimator scheme, for appropriate parameters, is 
almost as good as the mean estimator, even though it is unadaptive 
and therefore requires only a simple receiver. 

(ii) The use of rather crude algorithms for generating an estimate 
of N appeared to be inferior to clip estimator decoding with appro- 
priate parameters. 

APPENDIX A 

Justification of Table 2 Equations 

A.1 Unadaptive Estimator 

In the case of the unadaptive estimator $TA = I_{(k \times k)}$, so equation 
(3) reduces to equation (4) shown in Table II. Now in general, when 
M is the covariance matrix of the decoded noise, the mean square 
error will be the average of the on-diagonal terms of M, or in other 
words, (1/k) tr M. In this way (5) follows from (4). Equation (6) 
follows from (4) because $A^t A = (n/k) I_{(k \times k)}$ in this case. 

A Hadamard matrix is a square matrix with +1 or -1 elements 
and orthogonal columns. (Golomb and his associates fully describe 
Hadamard matrices and their application to binary block codes.**) 

In deriving equation (7), a straightforward evaluation of (6) under 
the assumption of diagonal N gives the result that 

\[ M_{ij} = \left(\frac{k}{n}\right)^2 \sum_{l=1}^{n} A_{li} A_{lj}\, n_l , \]

assuming: 

T is the unadaptive estimator 

$N = \mathrm{diag}(n_1, n_2, \ldots, n_n)$. 

The on-diagonal terms of the above can be evaluated by using the 
Hadamard assumption, which causes $(a_{li})^2$ to equal 1/k for all l and 
i, so that 

\[ M_{ii} = \frac{k}{n}\left[\frac{1}{n}\sum_{l=1}^{n} n_l\right], \]

assuming: 

T is the unadaptive estimator 

$N = \mathrm{diag}(n_1, n_2, \ldots, n_n)$ 

A is $1/(k)^{1/2}$ times the first k columns of any Hadamard matrix. 

Notice that the term in brackets is AM, the arithmetic mean of the 
set $(n_1, n_2, \ldots, n_n)$. 
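
The relation between the unadaptive m.s. error and the arithmetic mean of the noise variances can be checked with a short computation. The sketch below assumes a Sylvester-type Hadamard matrix (the paper's own construction follows Golomb), takes the unadaptive estimator to be $T = (k/n)A^t$ as implied by $TA = I$ and $A^t A = (n/k)I$, and uses illustrative sizes n = 16, k = 8.

```python
import numpy as np

def sylvester_hadamard(n):
    """n-by-n Hadamard matrix by Sylvester doubling (n must be a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

n, k = 16, 8
A = sylvester_hadamard(n)[:, :k] / np.sqrt(k)   # (a_li)^2 = 1/k, A^t A = (n/k) I
rng = np.random.default_rng(2)                  # seed chosen only for repeatability
noise_vars = np.where(rng.random(n) < 0.7, 0.3, 8.3)
N = np.diag(noise_vars)

T = (k / n) * A.T                               # unadaptive estimator, so that T A = I
M = T @ N @ T.T                                 # covariance of the decoded noise
ms_error = float(np.trace(M) / k)
predicted = float((k / n) * noise_vars.mean())  # (k/n) times the arithmetic mean AM
print(ms_error, predicted)
```

The agreement is exact up to rounding for any diagonal N, since only the squared magnitudes of the entries of A enter the on-diagonal terms.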


A.2 Unattenuated Estimator 


Equation (8) comes from (3) by direct substitution for the T 
matrix. 


A.3 Mean Estimator 

In the case of the mean estimator, 

\[ I_{(k \times k)} - TA = I_{(k \times k)} - (Q^{-1} + A^t N^{-1} A)^{-1} A^t N^{-1} A \]
\[ = I_{(k \times k)} - (Q^{-1} + A^t N^{-1} A)^{-1} (A^t N^{-1} A + Q^{-1} - Q^{-1}) \]
\[ = (Q^{-1} + A^t N^{-1} A)^{-1} Q^{-1}. \tag{31} \]

Substituting $(Q^{-1} + A^t N^{-1} A)^{-1} Q^{-1}$ for $(I_{(k \times k)} - TA)$ in equation (3) 
readily shows that 

\[ M = (Q^{-1} + A^t N^{-1} A)^{-1} \]

assuming T is the mean estimator. 

A.4 Joint Mean and Unattenuated Estimator 

The unattenuated estimator is the special case of the mean esti- 
mator when $Q^{-1} \to 0$. It is convenient to handle the two cases together 
by using the variable $\Omega = 1$ when the mean estimator is used, and 
$\Omega = 0$ when the unattenuated estimator is used. 

The next two equations use an approach from Berkowitz.*? Equa- 
tion (9) or (11) can be simplified by using the fact that, for any 
nonsingular Z, 

\[ \frac{1}{k}\,\mathrm{tr}\, Z^{-1} = \frac{1}{k}\sum_{i=1}^{k} \frac{1}{\lambda_i(Z)} \]

where $\lambda_i(Z)$ denotes the $i$th unordered eigenvalue of Z. The result is 
equation (12). When the signal is white, the relation $\lambda_i(\gamma I + Z) = 
\gamma + \lambda_i(Z)$ can be used, giving the equality in equation (13). When 
$\Omega = 1$ the positive semidefiniteness of $A^t N^{-1} A$ causes its eigenvalues 
to be real and nonnegative; when $\Omega = 0$ the positive definiteness of 
$A^t N^{-1} A$ will now need to be assumed. Because $1/(\Omega s^{-1} + \lambda)$ is a convex 
upward function of $\lambda$ in the region of possible $\lambda$, the inequality part 
of (13) follows by convexity. This inequality will prove useful later 
when, under additional assumptions, the term in brackets will be 
found in closed form. 
For square orthonormal A, it follows that $A^{-1} = A^t$, so 

\[ \lambda_i(A^t N^{-1} A) = \lambda_i(A^{-1} N^{-1} A) = \lambda_i(N^{-1}) = \frac{1}{\lambda_i(N)}. \]

Equation (14) results when the above is substituted into (13). Notice 
that when $\Omega$ is zero and N is diagonal, this will reduce to AM. On 
the other hand, when $\Omega$ is one, this will be less than AM. 

When A is rectangular, the next analysis leads to a closed form 
solution for the average of the eigenvalues of $A^t N^{-1} A$, under the as- 
sumptions of equation (15), and it also leads to upper bounds upon 
the m.s. error. The exact values of the components of A may enter 
into the formulas for some statistics of the error. However, in the 
first and second moment statistics to be investigated under the par- 
ticular assumptions made, it turns out that the only important prop- 
erty of the A matrix is the inner product between the $i$th and $j$th 
columns. This will always be $(n/k)\,\delta_{ij}$, independent of the particular 
Hadamard matrix upon which A is based. However, since higher- 
order moments are significant, especially in quantized channels, it is 
likely that some Hadamard matrices might be more useful for prac- 
tical purposes than others. 

Under the assumptions of equation (15), straightforward calcula- 
tions will show the following. HM is the harmonic mean of the diag- 
onal components of N; equation (15) includes its formula. 


t —1 
(2) A'NVA = HM a Vi ‘ie ee a 
where, for large k 


E((Y;):i:] = 0 
nt alla (32) 
BUY) = 


(a2) z| } tr A‘N7 4| = i ak (33) 
(172) Var E tr A'N 4 = 7 a. (34) 


The above equations are especially useful because they show that 

\[ \text{the average of the eigenvalues of } A^t N^{-1} A = \frac{n}{k}\,\frac{1}{\mathrm{HM}}. \]

This can be substituted into equation (13) to prove equation (16). 
Equation (16) becomes an equality when all of the eigenvalues of 
$A^t N^{-1} A$ are equal; otherwise the mean square error is greater. 

Evaluating the m.s. error from equation (12) requires the eigenvalues 
of what is typically a rather large matrix, and the trace formulas of 
(9) and (11) yield little insight, whereas the bound of equation (16) 
is a simple closed-form expression. The question therefore arises of 
whether the bound given by (16) is close enough to be used as an 
equality for design and analysis purposes. The analysis which follows 
will derive an upper bound for the m.s. error, which could be used to 
develop some sufficient conditions for near equality in equation (16). 

Let equation (32) be used to define $Y_k$, let $\lambda'(Y_k)$ denote 

\[ \max_i |\lambda_i(Y_k)| , \]

and let $\tau$ be any number such that 

\[ \tau \le \frac{\sum_i |\lambda_i(Y_k)|^2}{[\lambda'(Y_k)]^2}. \tag{35} \]

Notice that $\tau$ can always be as large as 1 and never exceeds k. The 
second of the following inequalities is Schur's inequality, which is 
valid for any square $Y_k$.7° The first comes from (35). 

\[ [\lambda'(Y_k)]^2 \le \frac{1}{\tau}\sum_i |\lambda_i(Y_k)|^2 \le \frac{1}{\tau}\sum_i \sum_j |(Y_k)_{ij}|^2 \tag{36} \]


Assuming that k is large enough for the weak law of large numbers 
to hold permits (32) to be used to evaluate the above double sum, 
so that with a few manipulations (36) reduces to 


r’( Y,) Ss ee ; (37) 


By using equations (13), (32), (33), and (37), and a relatively ob- 
vious property of convex functions,* equation (17) is established. 

APPENDIX B 

Relating Toeplitz Matrix Operations with Z-Transforms 

Theorem 2: If 

\[ A_{ij} = a_{i-j} \]

and if 

\[ a(z) = \sum_{q=-\infty}^{\infty} a_q z^{-q} \]

converges on |z| = 1 and has no poles or zeros for a finite distance 
from |z| = 1, then $A^{-1}$ exists and 

\[ a^{-1}(z) = \frac{1}{a(z)}. \]

Proof: Let 

\[ b(z) = \frac{1}{a(z)} \quad \text{for } |z| = 1. \]


* The property is that if f(x) is convex downward, and 

\[ \sum_i x_i = 0, \qquad \max_i |x_i| \le B, \]

then 

\[ \frac{1}{k}\sum_{i=1}^{k} f(u + x_i) \le \tfrac{1}{2} f(u - B) + \tfrac{1}{2} f(u + B). \]


1092 THE BELL SYSTEM TECHNICAL JOURNAL, JULY-AUGUST 1968 


The assumptions on $a(z)$ cause $a_q$ and $b_q$ to have geometrical decay,
and therefore the following converge absolutely:

$$(BA)_{ij} = \sum_{q=-\infty}^{\infty} B_{iq} A_{qj} = \sum_{q=-\infty}^{\infty} b_{i-q}\, a_{q-j} .$$

Also reducing to the above is $(AB)_{ij}$. Letting $q' = q - j$ gives

$$(BA)_{ij} = (AB)_{ij} = \sum_{q'=-\infty}^{\infty} b_{i-j-q'}\, a_{q'} = b_q * a_q \,\big|_{i-j}$$

where the $*$ denotes the convolution sum in the line above. Because
$b(z)a(z) = 1$, it follows that

$$b_q * a_q \,\big|_{i-j} = \delta_{ij} .$$

So $BA = AB = I_\infty$, thus proving that B is the inverse of A, which
completes the proof.
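
As a small numerical illustration of Theorem 2, the sketch below uses a causal example chosen here for convenience (it is not from the paper): $a(z) = 1 - 0.5 z^{-1}$, so that $1/a(z) = \sum_{q \ge 0} 0.5^q z^{-q}$. For a lower-triangular Toeplitz matrix the finite section happens to invert exactly, which makes the check simple. The function names and the matrix size are assumptions made for the illustration.

```python
def toeplitz_lower(coeffs, size):
    """Lower-triangular Toeplitz matrix with entries T[i][j] = coeffs[i - j]."""
    return [[coeffs[i - j] if 0 <= i - j < len(coeffs) else 0.0
             for j in range(size)] for i in range(size)]


def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][q] * Y[q][j] for q in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


size = 8
a = [1.0, -0.5]                       # coefficients of a(z) = 1 - 0.5 z^{-1}
b = [0.5 ** q for q in range(size)]   # coefficients of 1/a(z), truncated

A = toeplitz_lower(a, size)
B = toeplitz_lower(b, size)
identity = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]

# Largest deviation of BA and AB from the identity matrix.
max_error = max(
    abs(P[i][j] - identity[i][j])
    for P in (matmul(B, A), matmul(A, B))
    for i in range(size) for j in range(size)
)
```

The convolution of the two coefficient sequences is the unit impulse, so `max_error` should be zero up to rounding; a two-sided (noncausal) $a(z)$ would only invert approximately on a finite section.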
The following have proofs similar to that of the theorem. 


Lemma 3: If A and B are Toeplitz, then C = AB is Toeplitz with
$c(z) = a(z)b(z)$.

Lemma 4: The half-power of a Toeplitz matrix N can be defined by
$n^{1/2}(z) = \sqrt{n(z)}$.

The following has a straightforward proof:

Lemma 5: If A is Toeplitz, then $A^t$ is Toeplitz and $a^t(z) = a(1/z)$.


The following relates linear-real coding for Toeplitz matrices with 


Wiener filtering. 


Theorem 3: When A, Q, and N are infinite Toeplitz, then the least
mean square estimator

$$T = (Q^{-1} + A^t N^{-1} A)^{-1} A^t N^{-1}$$

is infinite Toeplitz, and is the noncausal Wiener filter, given by

$$t(z) = \frac{q(z)\, a(1/z)}{n(z) + a(1/z)\, a(z)\, q(z)} .$$

Proof: By Theorem 2 and Lemmas 3 and 5,

$$t(z) = \left[ \frac{1}{q(z)} + \frac{a(1/z)\, a(z)}{n(z)} \right]^{-1} \frac{a(1/z)}{n(z)} .$$

This equals the stated result, which completes the proof.


Corollary 3: When $A = I$,

$$T = (Q^{-1} + N^{-1})^{-1} N^{-1}$$

and

$$t(z) = \frac{q(z)}{q(z) + n(z)}$$

is the noncausal Wiener filter.


The following proof of equation (29) and statement (30) follows
the ideas of J. E. Mazo. For square $A$,

$$\operatorname{tr}\,[A^{-1}(A^{-1})^t] = \operatorname{tr}\,[(A^{-1})^t A^{-1}],$$

since in general tr HC = tr CH for square H and C. Now let $B = AA^t$.
Notice that $(A^{-1})^t A^{-1}$ is $B^{-1}$. Equation (29) is then:

$$\frac{1}{k^2}\, \operatorname{tr} B \ \operatorname{tr} B^{-1} \ge 1 .$$

But

$$\operatorname{tr} B = \sum_{i=1}^{k} \lambda_i(B), \qquad \operatorname{tr} B^{-1} = \sum_{i=1}^{k} \frac{1}{\lambda_i(B)},$$

so

$$\frac{1}{k^2}\, \operatorname{tr} B \ \operatorname{tr} B^{-1} = \frac{\dfrac{1}{k} \sum_{i=1}^{k} \lambda_i(B)}{\left[ \dfrac{1}{k} \sum_{i=1}^{k} \dfrac{1}{\lambda_i(B)} \right]^{-1}} .$$

The numerator and denominator are respectively the arithmetic and
harmonic means of the eigenvalues of the B matrix. Hardy, Littlewood,
and Polya (p. 26, special case of 2.9.1) show that this ratio
always exceeds one, except when the eigenvalues are all the same, in
which case it is one.$^{17}$ This proves (29).
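
The arithmetic-to-harmonic mean ratio in this argument is easy to check numerically. The sketch below works directly with a list of positive eigenvalues rather than with a matrix; the function name `trace_ratio` and the sample eigenvalues are illustrative assumptions, not part of the paper.

```python
def trace_ratio(eigenvalues):
    """(1/k^2) * tr(B) * tr(B^{-1}) for a matrix B with the given positive eigenvalues."""
    k = len(eigenvalues)
    trace_b = sum(eigenvalues)
    trace_b_inverse = sum(1.0 / lam for lam in eigenvalues)
    return trace_b * trace_b_inverse / k ** 2


equal_case = trace_ratio([2.0, 2.0, 2.0])     # all eigenvalues the same
spread_case = trace_ratio([1.0, 2.0, 4.0])    # unequal eigenvalues
```

With equal eigenvalues the ratio is one; any spread in the eigenvalues pushes it above one, which is the content of (29).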

At equality B has equal eigenvalues, and since it is symmetric the
eigenvectors span the space and B is proportional to an orthogonal
matrix:

$$B = P^t D P \quad \text{(because B is symmetric)}$$

$$\;\; = P^t (\lambda I) P = \lambda I .$$

Therefore $AA^t = \lambda I$ and $A^{-1} = \lambda^{-1} A^t$, so A is proportional to an
orthogonal matrix at equality, thereby establishing (30) and completing the
proof of (29) and (30), and with it the proof of Theorem 1.


APPENDIX C 


Finding Noise Component 


In the text we discuss the circuit shown in Fig. 6 and state that a 
way to find part of the noise component is to sample at the sampling 
instants, reconstruct the waveform which would be transmitted if 
these sample values were data-signal values, and then subtract this 
signal from the actual received signal. 

The proof of this statement requires the use of the valid converse
of the sampling theorem, which states that an arbitrary function
with frequency components out to $|\omega| = \pi/T_0$ cannot be reconstructed
from samples every $T$ seconds if $T > T_0$. If it is assumed that


(i) $h(0) = 1$
(ii) $h(nT) = 0$ for $n \neq 0$
(iii) $H(\omega)$ is nonzero for $|\omega| < \pi/T_0$
(iv) $T_0 < T$
(v) The additive noise $c(t)$ has components at all frequencies for
which $H(\omega)$ has components,


then it follows that 


actual sample at $t$ $= c(t) + \sum_n r_n\, h(t - nT)$

predicted sample at $t$ based upon
samples at $nT$, $n = 0, \pm 1, \pm 2, \cdots$ $= \sum_n [r_n + c(nT)]\, h(t - nT)$

$\Delta(t)$ = difference of the above $= c(t) - \sum_n c(nT)\, h(t - nT)$.

By using a well-known result in sampling theory$^{12}$ the Fourier transform
of $\Delta(t)$ can be written as either of the following.

$$\Delta(\omega) = \mathcal{F}[\Delta(t)] = C(\omega) - H(\omega) \sum_n c(nT)\, e^{-j\omega nT}$$

$$= C(\omega) - \frac{1}{T}\, H(\omega) \sum_n C\!\left(\omega - \frac{n 2\pi}{T}\right).$$

By the converse to the sampling theorem, no $H(\omega)$ will make $\Delta(\omega)$
zero for all $\omega$. Consequently, $\Delta(\omega)$ contains some components of the
additive noise. If $T_0 = T/2$, then the direct sampling theorem shows
that samples every $T/2$ are sufficient to reconstruct $\Delta(t)$.


APPENDIX D 


The purpose of this appendix is to state and prove the following 
theorem. 


Theorem 4: Assuming


(i) Channel I has additive noise c independent of the signal b
(ii) Channel II has additive noise g independent of the signal b
(iii) c and g are zero mean, and each is even about its mean
(iv) $F(\alpha) = p(|c| \le \alpha)$, ($\alpha$ is defined to be nonnegative)
(v) $K(\alpha) = p(|g| \le \alpha)$
(vi) In both channels signal plus noise are passed through the memoryless
nonlinearity nl( ) at the receiver
(vii) nl(x) is odd
(viii) nl(x) has a slope bounded between 0 and 1 for all x, and this slope
is monotonically decreasing in |x|
(ix) The mean square errors of channels I and II are $MSE_1$ and $MSE_2$,
respectively.
(x) Channel I is noisier than channel II in the sense that $F(\alpha) \le K(\alpha)$
for all $\alpha$, which means that for every bit of probability density g has
at $\pm\beta$, c has an equal amount at a distance which is at least $\beta$,


then

$$MSE_1 \ge MSE_2 .$$

(Thus, worst-case noise gives worst-case results with these nonlinearities.)

The next definition and lemma are used in the proof of Theorem 4.
Let $MS(\alpha)$ denote the special case of $MSE_1$ of Theorem 4 when

$$p_c(c) = \tfrac{1}{2}\delta(c + \alpha) + \tfrac{1}{2}\delta(c - \alpha) \qquad (38)$$

where $\alpha$ is a positive constant, and $\delta(\ )$ denotes the Dirac impulse
function.


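Lemma 6: Under the conditions of Theorem 4, $\partial MS(\alpha)/\partial\alpha \ge 0$.

Proof:

$$MS(\alpha) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} [\,nl(b + c) - b\,]^2\, p_b(b)\, p_c(c)\, dc\, db . \qquad (39)$$

Substituting equation (38) for $p_c(c)$, integrating with respect to $c$, and
then taking the partial derivative with respect to $\alpha$ gives

$$\frac{\partial MS(\alpha)}{\partial \alpha} = \int_{-\infty}^{\infty} \Big\{ \underbrace{[\,nl(b + \alpha) - b\,]}_{A}\ \underbrace{nl'(b + \alpha)}_{B} - \underbrace{[\,nl(b - \alpha) - b\,]}_{C}\ \underbrace{nl'(b - \alpha)}_{D} \Big\}\, p_b(b)\, db . \qquad (40)$$

Now

$$[\text{assumptions 7, 8}] \Rightarrow [\,C \le 0 \text{ for } b \ge 0,\ A \ge 0 \text{ for } b \le 0\,] \qquad (41)$$

$$[\text{assumption 8}] \Rightarrow [\,D \ge 0,\ B \ge 0\,] \qquad (42)$$

$$[\text{assumption 8}] \Rightarrow [\,B \le D \text{ when } b \ge 0,\ B \ge D \text{ when } b \le 0\,]. \qquad (43)$$

Therefore

$$-CD \ge -CB \quad \text{when } b \ge 0 \qquad (44)$$

$$AB \ge AD \quad \text{when } b \le 0. \qquad (45)$$

Consequently

$$\frac{\partial MS(\alpha)}{\partial\alpha} \ge \int_{-\infty}^{0} (A - C)\, D\, p_b(b)\, db + \int_{0}^{\infty} (A - C)\, B\, p_b(b)\, db . \qquad (46)$$

But

$$A - C = \int_{b-\alpha}^{b+\alpha} nl'(x)\, dx \qquad (47)$$

and by assumption 8 the integrand is nonnegative, so both sides of
(47) are nonnegative. This fact and (42), and the nonnegativeness
of $p_b(b)$, make the right side of (46) nonnegative, which proves the
lemma.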
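
Lemma 6 can be spot-checked numerically. The sketch below assumes a particular nonlinearity, `math.tanh`, which is odd and has a slope between 0 and 1 that decreases in |x|, and an assumed binary signal b = ±1 with equal probability; neither choice comes from the paper. It evaluates MS(α) for the two-impulse noise density of (38) on a grid and checks that the values never decrease.

```python
import math


def mean_square_error(alpha, signal_levels=(-1.0, 1.0), nl=math.tanh):
    """MS(alpha): noise is +alpha or -alpha with probability 1/2 each, as in (38)."""
    total = 0.0
    for b in signal_levels:              # equiprobable signal values (assumed)
        for c in (-alpha, alpha):        # the two noise impulses
            total += 0.5 * (nl(b + c) - b) ** 2
    return total / len(signal_levels)


alphas = [0.05 * i for i in range(101)]            # alpha from 0 to 5
values = [mean_square_error(alpha) for alpha in alphas]

# Lemma 6 predicts a nondecreasing sequence (small tolerance for rounding).
is_nondecreasing = all(later >= earlier - 1e-12
                       for earlier, later in zip(values, values[1:]))
```

A grid check of this kind is only evidence for one nonlinearity and one signal distribution; the lemma itself covers every choice satisfying assumptions (vii) and (viii).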

Proof of Theorem 4: The assumed evenness of the noises, and the
linearity of the expectation operator, permit the $MS(\alpha)$ function to
be used to evaluate the mean square error, as follows

$$MSE_1 - MSE_2 = \int_0^\infty MS(\alpha)\, dF(\alpha) - \int_0^\infty MS(\alpha)\, dK(\alpha) . \qquad (48)$$

The above right side can be combined into one integral, such that
integrating by parts gives zero for the end conditions plus the resulting
integral.

$$MSE_1 - MSE_2 = \int_0^\infty [K(\alpha) - F(\alpha)] \left\{ \frac{\partial MS(\alpha)}{\partial \alpha} \right\} d\alpha . \qquad (49)$$

Assumption 10 makes the bracketed term nonnegative, whereas
Lemma 6 makes the braced term nonnegative, so the right side of
(49) when integrated is nonnegative, which proves the theorem.


REFERENCES 


1. Knox-Seith, J., unpublished work. 

2. Anderson, R. R. and Koll, V. G., unpublished work. 

3. Stamboulis, A. P., unpublished work. 

4. Gibson, E. D., “A Highly Versatile Corrector of Distortion and Impulse
Noise,” Proc. Nat. Elec. Conf., 23, 1961.

5. Lerner, R. M. “Design of Signals,” Chapter 11 of Lectures on Communication 
Theory, ed. E. J. Baghdady, New York: McGraw-Hill, 1961. 

6. Holland-Moritz, E. K., Dute, J. C., and Strember, F. G., “Feasibility of the
Swept-Frequency Modulation Technique,” Report 4435-16-F, Radar Laboratory,
Inst. Sci. and Technology, University of Michigan, August 1962.

7. Wainwright, R., “Overcoming Impulse Noise Interference in Narrowband 
Data Communication Systems by a Sophisticated Filter Technique,” Rixon 
Eng. Bull. No. 70 (July 1960); also Rome-Utica IRE Conf., October 1960. 

8. Helstrom, C. W., “Topics in the Transmission of Continuous Information,” 
Westinghouse Res. Laboratories Report 64-8C3-522-R1, August 27, 1964. 

9. Pierce, W. H., “Linear-Real Coding,” IEEE Int. Conv. Record, part VII
(1966), pp. 44-53.

10. Berkowitz, S., Ph.D. Thesis, Carnegie Inst. Technology, 1966. 

11. Smith, D. H., “A Magnetic Shift Register Employing Controlled Domain
Wall Motion,” IEEE Trans. Magnetics, 1, (December 1965), pp. 281-284.

12. Ragazzini, J. R. and Franklin, G. F., Sampled-Data Control Systems, New 
York: McGraw-Hill, 1958. 

13. Grenander, U., and Szego, G., Toeplitz Forms and Their Applications,
Berkeley, Calif., University of California Press, 1958.

14. Kisel’, V. A., “Phase Correcting Circuits Using Delay Lines,” Telecommunications
and Radio Engineering (Elektrosvyaz, Radio Tekhnika), December 1965.

15. Golomb, S. W., Baumert, L. D., Easterling, M. F., Stiffler, J. J., and Viterbi,
A. J., Digital Communications, Englewood Cliffs, N.J.: Prentice-Hall, 1964.

16. Schur, I., Math. Ann., 66 (1909), pp. 488-510. 

17. Hardy, G. H., Littlewood, J. E., and Polya, G., Inequalities, New York:
Cambridge University Press, 1959.


Matrix Multiplication and Fast 
Fourier Transforms 


By W. MORVEN GENTLEMAN 
(Manuscript received January 29, 1968) 


Factoring a matrix and multiplying successively by the factors can 
sometimes be used to speed up matrix multiplications. This is, in fact, 
the trick which creates the fantastic gains of the fast Fourier transform. 


The same trick which creates the fantastic gains of the fast Fourier 
transform may be used with other matrices. 
As an example, suppose the matrix


   1  -10    4    3  -14   12
  -5    2  -20   -7    6  -28
   2  -20    1    6  -28    3
 -20    1  -10  -28    3  -14
   4   -5    2   12   -7    6
 -10    4   -5  -14   12   -7

is to be multiplied by a large number of different vectors, so that it
is worthwhile to try to be as efficient as possible. At first glance, it
would appear that (neglecting the possibility that multiplications
by one might not actually be performed) multiplying this matrix
with a single column vector would require 6² = 36 multiplications
and 6(6 − 1) = 30 additions. The crafty person, however, might notice
that this matrix may be written as the product of two matrices:



   1   2   4   .   .   .        1   .   .   3   .   .
   .   .   .   1   2   4        .  -5   .   .  -7   .
   2   4   1   .   .   .        .   .   1   .   .   3
   .   .   .   4   1   2       -5   .   .  -7   .   .
   4   1   2   .   .   .        .   1   .   .   3   .
   .   .   .   2   4   1        .   .  -5   .   .  -7

The zero elements in the decomposed form have been written as
periods to emphasize that these elements need not really enter into
the computation when either of these matrices multiplies a vector.
In view of this, multiplying sequentially by the two factors would
require only 6(2) + 6(3) = 30 multiplications and 6(1) + 6(2) = 18
additions.
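
The operation counts quoted above can be checked mechanically. The sketch below assumes one particular layout of the two sparse factors, chosen so that their product reproduces the 6 × 6 matrix of the example; the helper names `matmul`, `op_counts`, and `unit_entries` are illustrative and not from the paper.

```python
# The 6 x 6 example matrix.
M = [
    [1, -10, 4, 3, -14, 12],
    [-5, 2, -20, -7, 6, -28],
    [2, -20, 1, 6, -28, 3],
    [-20, 1, -10, -28, 3, -14],
    [4, -5, 2, 12, -7, 6],
    [-10, 4, -5, -14, 12, -7],
]
# Assumed sparse factors: three nonzeros per row in A, two per row in B.
A = [
    [1, 2, 4, 0, 0, 0],
    [0, 0, 0, 1, 2, 4],
    [2, 4, 1, 0, 0, 0],
    [0, 0, 0, 4, 1, 2],
    [4, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 4, 1],
]
B = [
    [1, 0, 0, 3, 0, 0],
    [0, -5, 0, 0, -7, 0],
    [0, 0, 1, 0, 0, 3],
    [-5, 0, 0, -7, 0, 0],
    [0, 1, 0, 0, 3, 0],
    [0, 0, -5, 0, 0, -7],
]


def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][q] * Y[q][j] for q in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def op_counts(X):
    """(multiplications, additions) for X times a vector, skipping zero entries only."""
    mults = sum(1 for row in X for v in row if v != 0)
    adds = sum(max(sum(1 for v in row if v != 0) - 1, 0) for row in X)
    return mults, adds


def unit_entries(X):
    """Number of entries equal to +1 or -1, each of which saves a multiplication."""
    return sum(1 for row in X for v in row if abs(v) == 1)


product_matches = matmul(A, B) == M
direct_cost = op_counts(M)
factored_cost = tuple(x + y for x, y in zip(op_counts(A), op_counts(B)))
ones_direct = unit_entries(M)
ones_factored = unit_entries(A) + unit_entries(B)
```

Under these assumptions `direct_cost` is (36, 30) and `factored_cost` is (30, 18), and the counts of unit entries are 3 and 9, matching the savings described in the text.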

If we are really concerned about efficiency, more can be done by 
taking into account other special elements. For example, observing 
that 1 or —1 require only an addition or subtraction would save 3 
multiplications in the original form, and 9 multiplications in the 
decomposed form. Other savings could be made if some of the ele- 
ments of a column were negatives of other elements in the same 
column. 

In the three years since the fast Fourier transform was first pub- 
lished,t there have been numerous accounts of what it is and why it 
works. The more mathematical of these tend to explain it in terms 
of the fact that the quotient group of a cyclic subgroup of order 
MN relative to its cyclic subgroup of order M is itself a cyclic group 
of order N. Those accounts written by computer people usually con- 
sider the binary representation of the time and frequency indices, 
and observe how each bit enters into the summed products. And 
accounts written by engineers invariably explain the algorithm in 
terms of merging the spectra of suitable decimations of the original 
series to form the spectrum of the original series itself. 

These approaches are, of course, all quite valid, but they miss the
essence of the fast Fourier transform which is, in fact, contained
in the example above. If we wish to multiply a matrix M by a column
vector x, it may be possible to find a factorization M = AB such
that forming first y = Bx then z = Ay requires fewer multiplications
and additions than would forming z = Mx directly. The factors
A and B might themselves be able to be factored further profitably.

The fast Fourier transform is a special case of this, where the matrix
of interest is the finite discrete Fourier transform matrix whose elements
are $\exp[2\pi i(t\hat{t}/N)]$ for $\hat{t}$ and $t$ from 0 to $N - 1$. It is really quite
irrelevant that the factors turn out to be (except for a permutation and
phase shifts) block diagonal matrices where each block is of the same
form as the original matrix; this fact is only used in showing that the
factoring can be continued.*

* In fact, for the fastest programs it is not even quite true. See Bergland.$^2$
The factors there are not equivalent to each other as the “twiddle factors” have
been redistributed to increase the number of coefficients having simple forms.

Indeed, the example above has exactly the same structure as a
6 = 3 × 2 point fast Fourier transform, except that the nonzero
elements in the factors are different. And it achieves exactly the
same savings that the fast Fourier transform does in this case. Even
the comments about taking advantage of explicit plus or minus ones
or negatives of other elements in the column reflect features currently
in the better fast Fourier transform programs.

Having seen that the possibility that matrix factoring will speed 
things up is not unique to the finite Fourier transform, we might 
ask when we can expect to take advantage of it. It is immediately 
evident that it does not improve things all the time. We cannot, for 
example, reduce the number of operations required to multiply by a 
diagonal matrix. Can we then identify those matrices for which it 
is useful? Unfortunately not, except by exhibiting a factorization 
with the required property. 

At this point it is useful to observe that, taking advantage only 
of zeros and ones, there always exist factorizations which do at least 
as well as the original matrix. This is trivially true if one of the 
factors is some permutation matrix, but more interestingly so if we 
consider factors generated by row (or column) elimination as used 
in the Gaussian elimination method of solving simultaneous linear 
equations. In matrix terms this process is based on the observation 
that 


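$$\begin{bmatrix} m_{11} & m_{12} & \cdots & m_{1n} \\ m_{21} & m_{22} & \cdots & m_{2n} \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ r & 1 \end{bmatrix}
\begin{bmatrix} m_{11} & m_{12} & \cdots & m_{1n} \\ m_{21} - r m_{11} & m_{22} - r m_{12} & \cdots & m_{2n} - r m_{1n} \end{bmatrix} .$$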


The parameter r is then chosen to make one of the elements in the 
second row vanish. Since this means that the right factor takes one 
less multiplication and one less addition than the original matrix did, 
and since the left factor clearly only requires one multiplication and 
one addition, the total number of operations for the two factors is 
exactly the same as for the original matrix. 
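
The invariance just described can be illustrated on small 2 × 3 matrices. In the sketch below, which is only an illustration under stated assumptions, entries equal to 0 cost nothing and entries equal to ±1 cost no multiplication; the two sample matrices and the helper names are hypothetical. Exact rational arithmetic (`fractions.Fraction`) keeps the elimination free of rounding.

```python
from fractions import Fraction


def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][q] * Y[q][j] for q in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def op_counts(X):
    """(multiplications, additions) for X times a vector; 0 and +-1 need no multiply."""
    mults = sum(1 for row in X for v in row if v not in (0, 1, -1))
    adds = sum(max(sum(1 for v in row if v != 0) - 1, 0) for row in X)
    return mults, adds


def eliminate(M):
    """Factor a two-row matrix as [[1, 0], [r, 1]] times a matrix whose (2, 1) entry is 0."""
    top, bottom = M
    r = Fraction(bottom[0], top[0])
    left = [[1, 0], [r, 1]]
    right = [list(top), [b - r * t for b, t in zip(bottom, top)]]
    return left, right


def factored_cost(M):
    """Total operation count when multiplying by the two factors in sequence."""
    left, right = eliminate(M)
    assert matmul(left, right) == M      # the factorization reproduces M
    (lm, la), (rm, ra) = op_counts(left), op_counts(right)
    return lm + rm, la + ra


generic = [[2, 3, 5], [3, 7, 11]]   # elimination creates one zero, nothing else special
lucky = [[2, 3, 5], [4, 7, 9]]      # elimination also creates a +1 and a -1

generic_direct, generic_factored = op_counts(generic), factored_cost(generic)
lucky_direct, lucky_factored = op_counts(lucky), factored_cost(lucky)
```

For the generic matrix the factored count equals the direct count, while for the second matrix the elimination happens to create extra unit entries and the factored form needs fewer multiplications.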

In other words, row (or column) elimination preserves the number 
of operations required to form the product of the matrix with an 
arbitrary vector. This assertion assumes, of course, that in the elim- 
ination we do not destroy more special elements (such as zeros or 
ones) than we create. In fact, if we can create more of these special 
elements than we had before, we have won: we have achieved a 
factorization requiring fewer operations than did the original matrix.

Notice that in the above example we used a nonsquare matrix. In
fact, nothing in the whole discussion suggested M be square, and 
considering nonsquare matrices is no more difficult than considering 
square ones. An immediate application of this is to the case where 
a set of only a few Fourier coefficients are required from a large 
number of very long sequences. Up until now, usually the best that 
could be done was to compute the complete fast Fourier transforms 
and discard the unneeded coefficients. 

But it is apparent that by carefully factoring the matrix consisting 
of those rows of the finite Fourier transform matrix which are of 
interest, a more efficient algorithm can be produced, tailored to the 
problem. A reasonable factorization to start from might be fast 
Fourier factorization of the complete matrix. This is illustrated below 
for the case where three coefficients are wanted from an eight point 
transform. The four factor matrices are the reordering and the three 
passes of the Cooley factorization. Only those rows of each matrix
which are marked by arrows need actually be computed. ($W = \exp[(2\pi i)/8]$;
explicit negatives and ones are represented as such.)



→   1    1    1    1    1    1    1    1
→   1    W    W²   W³  -1   -W   -W²  -W³
→   1    W²  -1   -W²   1    W²  -1   -W²
    1    W³  -W²   W   -1   -W³   W²  -W
    1   -1    1   -1    1   -1    1   -1
    1   -W    W²  -W³  -1    W   -W²   W³
    1   -W²  -1    W²   1   -W²  -1    W²
    1   -W³  -W²  -W   -1    W³   W²   W

= [product of four sparse factor matrices: the reordering and the three passes of the Cooley factorization, with arrows marking the rows that need to be computed; the individual entries are not legible in this copy]



We could also have regarded $\pm i$ as special elements.

Our suggestion then is that if one has a matrix which he wants 
to multiply efficiently into a great number of arbitrary vectors, it 
might be worthwhile to try to find a factorization of the matrix such 
that multiplying sequentially by the factors is cheaper than multiply- 
ing by the original matrix. Indeed, it is worthwhile to try to find an 
extremely good, perhaps even the best, such factorization. 

Since we cannot identify a priori matrices for which this can be 
done, let alone give an algorithm for finding the best or even just 
a good factorization, the best we can recommend is to generate trial 
factorizations and compare them. A useful tool for this is row (or 
column) elimination: because of the invariance property mentioned 
earlier, such a factorization cannot lose much, and might gain. As 
an exercise to the reader, we suggest deriving the factorization of 
the matrix given at the beginning of this paper, or the eight point 
fast Fourier transform above. Notice that in the case of the fast 
Fourier transform it is useful to express the matrix in real arithmetic 
before reducing it, because then it is more obvious how to go further 
in the reduction, since in the computer it is usually the number of 
real operations that counts. 


On a Class of Configuration and
Coincidence Problems 


By Z. A. MELZAK* 
(Manuscript received August 28, 1967) 


Let A and B be sets in $E^m$ where B is convex and symmetric about o.
Let n points be taken in A and let $B_i$ be the translate of B centered at the
ith one. Let Y be the subset of the Cartesian product $A^n$, corresponding to
the configurations $(B_1, \ldots, B_n)$ such that no more than $p - 1$ sets $B_i$
intersect, or corresponding to any similar configuration condition, expressible
in purely Boolean terms. The problem of evaluating various integrals
over Y generalizes a number of questions in queuing, telephone traffic,
statistical mechanics of hard spheres, and so on. This article gives a complete
solution for certain special cases, and discusses numerical (Monte Carlo)
techniques.


I. INTRODUCTION 


We consider here a number of problems of the following general type.
Let A and B be two sets in the m-dimensional Euclidean space $E^m$ ($m \ge 1$).
B is assumed to have a center of symmetry and for any point x, $B(x)$
denotes the translate of B centered at x. An integer n ($n \ge 2$) is fixed
and the n-fold Cartesian product $A \times A \times \cdots \times A$ is denoted by P.
If $u \in P$ then $u = (x_1, \ldots, x_n)$ where $x_i \in A$ for $i = 1, \ldots, n$; we shall
be interested in the sets $B(x_1), \ldots, B(x_n)$. By a configuration condition
we shall understand a statement referring to the relative positions of
the sets $B(x_1), \ldots, B(x_n)$ and describing their intersection properties
in purely Boolean terms.

Examples of admissible configuration conditions are: (i) the n sets
are pairwise disjoint, (ii) their intersection is empty, (iii) their union is
connected. A configuration condition which generalizes (i) and (ii) is: an
integer p is given ($2 \le p \le n$) and no p of the n sets intersect. Any admissible
configuration condition C induces a partition of P into two disjoint
and complementary sets $Y = Y(C)$ and $N = N(C)$; if $u = (x_1, \ldots, x_n) \in P$
then $u \in Y$ if and only if the condition C holds for $B(x_1), \ldots, B(x_n)$.
Finally, a function $F = F(x_1, \ldots, x_n)$ is defined over P and dV denotes
the volume element $dx_1 \cdots dx_n$. Our problem is to evaluate the integral

$$J = \int_Y F\, dV .$$

* University of British Columbia, Vancouver.



In all cases to be considered the sets A, B, and Y, as well as the 
function F’, will be sufficiently regular so that the questions of meas- 
urability and integrability will not arise. In fact, in most cases of 
interest B turns out to be a ball, a cube, or an m-dimensional regular
octahedron. All these are Minkowski balls for a suitable norm $\rho(\xi) =
\rho(\xi_1, \ldots, \xi_m)$. We get the Euclidean ball with

$$\rho(\xi) = \Big( \sum_{i=1}^{m} \xi_i^2 \Big)^{1/2},$$

the cube with $\rho(\xi) = \max(|\xi_1|, \ldots, |\xi_m|)$, and the octahedron with

$$\rho(\xi) = \sum_{i=1}^{m} |\xi_i| .$$


It will be therefore assumed throughout that B is a Minkowski ball. 
This amounts simply to assuming that B is a convex symmetric body. 
The precise shape of A is of no particular importance, only its con- 
tent and sufficient regularity are. 

The integrand F will be usually of some highly symmetric type 
such as 


$$F = 1, \qquad F = \prod_{i=1}^{n} f(x_i), \qquad F = \prod_{1 \le i < j \le n} f(|x_i - x_j|),$$


where f is a suitable sufficiently regular function. 

In this part of the paper we are concerned with certain special 
configuration conditions which lead to an explicit expression for J in 
terms of the so-called cluster-integrals. Later we consider a related 
expansion of the form 


$$J = \sum_{j=0}^{\infty} c_j \lambda^j \qquad (1)$$

where the parameter $\lambda$ measures the ratio of sizes of B to A. We shall
take up the questions of the existence of the expansion (1) and the
regularity of J as a function of $\lambda$.


II. EXAMPLES


Example 1

Let $m = 1$, let A be the interval $[0, L]$ and B the interval $(0, a]$, let n be
any integer such that $(n - 1)a \le L$, let the configuration condition be
that the sets $B(x_1), \ldots, B(x_n)$ are disjoint, and let $F = 1/L^n$. J is now the
probability that with n points at random on the interval $[0, L]$ no two
points are closer than a.


Example 2 
Let m, A, and B be as above,

$$F(x_1, \ldots, x_n) = \prod_{i=1}^{n} f(x_i)$$

where $f(x)$ is a probability density on A. The configuration condition
is: p is an integer ($2 \le p \le n$) and some p-tuple of the sets $B(x_1), \ldots,
B(x_n)$ is to have a nonempty intersection. Here we have the following
interpretation: $[0, L]$ is a basic time interval and n events occur during
that time. Each event occurs independently of the others with the 
probability density f(x). A p-fold coincidence is defined to be the com- 
pound event arising when some p events occur closely together—on a 
time-interval of length a. Now J is the probability that a p-fold coinci- 
dence occurs. 

The above examples show that problems of our type might be of 
interest in queuing theory, telephone traffic, the theory of particle 
counters, and in similar areas. The next example is a scattering 
problem for a random linear array of n identical isotropic point- 
scatters, no two of which can be too close together. 


Example 3 


Let m, A, B, and C be as in example 1. We suppose that the wavelength
is $2\pi$ and that L is an integral multiple of it. Aside from proportionality
factors the signal scattered by the array is the vector $(\xi, \eta)$
where

$$\xi = \sum_{i=1}^{n} \cos x_i , \qquad \eta = \sum_{i=1}^{n} \sin x_i .$$

We are here interested in the probability $P(u, v)$ that

$$u \le \xi \le u + du \quad \text{and} \quad v \le \eta \le v + dv .$$

The Markov method$^1$ gives

$$P(u, v) = (2\pi)^{-2} J_1^{-1} \int\!\!\int e^{-i(ur + vs)}\, B(r, s)\, dr\, ds$$

where

$$B(r, s) = \int_{Y_1} \exp\Big[ i \Big( r \sum_{j=1}^{n} \cos x_j + s \sum_{j=1}^{n} \sin x_j \Big) \Big]\, dV .$$

$J_1$ and $Y_1$ are the integral and the region of example 1, respectively.
Therefore the spectrum $B(r, s)$ is obtained in the form of our integral
J if we take

$$F = \prod_{j=1}^{n} f(x_j), \qquad f(x) = e^{i(r \cos x + s \sin x)} .$$

When $a = 0$ then $P(u, v)$ reduces to the probability density for the
isotropic plane random walk of n unit displacements in arbitrary
directions.


Example 4 


Let $m = 3$, let A be any large and sufficiently regular portion of
space, and let B be the ball of radius a. The configuration condition
is that no two sets $B(x_i)$ and $B(x_j)$ overlap. There is a suitable given
function $\varphi(x)$ and

$$F(x_1, \ldots, x_n) = \prod_{1 \le i < j \le n} e^{-\varphi(|x_i - x_j|)} .$$

Now, aside from some simple normalization factors, J is the so-called
partition function for a hard-sphere model of idealized gas with intermolecular
potential $\varphi$ and the hard core radius a.$^2$

The knowledge of J is here of considerable importance in statistical 
mechanics and a great deal of work has been done on the subject of 
evaluating J in the form (1) which is closely associated with the so- 
called virial expansion. 


III. A SPECIAL CASE 


The method to be used involves certain dissections of Cartesian 
products together with the inclusion-exclusion principle of combina- 
torics.2 As an illustration and an introduction to the more complex 
examples which follow, we consider here at some length example 1 
of the previous section. The material is taken from Ref. 4, where some 
further details can be found. The well-known solution$^{5,6}$ is here

$$J = J(n, a, L) = [1 - (n - 1)a/L]^n \qquad (2)$$

and it may be obtained analytically as follows.
Let the coordinates of the n points be $x_1, \ldots, x_n$; these can be ordered
in $n!$ ways. Suppose that $0 \le x_1 \le x_2 \le \cdots \le x_n \le L$; the conditions
of the problem are satisfied if and only if

$$0 \le x_1 \le x_2 - a \le x_3 - 2a \le \cdots \le x_n - (n - 1)a \le L - (n - 1)a . \qquad (3)$$

Let $y_i = x_i - (i - 1)a$ $(i = 1, \ldots, n)$; then the probability that (3)
holds is $L^{-n}$ times the volume of the region in $E^n$ consisting of the
points $y = (y_1, \ldots, y_n)$ for which

$$0 \le y_1 \le y_2 \le \cdots \le y_n \le L - (n - 1)a .$$

The volume in question is $[L - (n - 1)a]^n / n!$; since there are $n!$ equiprobable
orderings we get (2) at once.
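
Formula (2) lends itself to a quick Monte Carlo check of the kind the abstract alludes to. The sketch below is an illustration only: it draws n independent uniform points on [0, L], tests whether any two are closer than a, and compares the observed frequency with (2). The function names, the seed, and the parameter values are assumptions chosen for the example.

```python
import random


def no_two_closer_than(points, a):
    """True if every pair of points is at least a apart (adjacent sorted gaps suffice)."""
    ordered = sorted(points)
    return all(right - left >= a for left, right in zip(ordered, ordered[1:]))


def estimate_j(n, a, length, trials, seed=0):
    """Fraction of random n-point configurations on [0, length] with no close pair."""
    rng = random.Random(seed)
    hits = sum(
        no_two_closer_than([rng.uniform(0.0, length) for _ in range(n)], a)
        for _ in range(trials)
    )
    return hits / trials


def exact_j(n, a, length):
    """Closed form (2): [1 - (n - 1) a / L]^n."""
    return (1.0 - (n - 1) * a / length) ** n


n, a, length = 4, 0.05, 1.0
estimate = estimate_j(n, a, length, trials=100_000)
exact = exact_j(n, a, length)
```

With 100,000 trials the sampling error is on the order of a few thousandths, so `estimate` should agree with `exact` to about two decimal places.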

Consider next an alternative geometrical proof of (2), which is 
considerably more complicated, but leads to useful generalizations 
and gives some additional insight. 

First, let $n = 2$. The sample space of pairs $(x_1, x_2)$ $(0 \le x_1, x_2 \le L)$
is the square Q of side-length L, lying in the first quadrant of $E^2$ and
containing the origin as a vertex. Let D be the diagonal of Q through
the origin and draw the two lines parallel to D at the distance $2^{-1/2} a$
from it. The hexagonal subset of Q contained between those two lines
is the sample space of the forbidden configurations with $|x_1 - x_2| \le a$.
The remainder of the square Q consists of two congruent triangles which
can be moved together so as to form a square $Q_1$ of side-length $L - a$.
By the randomness assumption $J(2, a, L)$ is the ratio of the areas of
$Q_1$ and Q, which yields (2) for $n = 2$.

The case of arbitrary n is handled similarly. In $E^n$ we take a
Cartesian coordinate system with the n axes $X_1, \ldots, X_n$. The n-dimensional
cube

$$H = \{(x_1, \ldots, x_n) : 0 \le x_i \le L,\ i = 1, \ldots, n\}$$

is then the sample space of all n-tuples of points on the segment $[0, L]$.
Let $I_i$ be the interval $[0, L]$ on the $X_i$-axis. In the two-dimensional
square face $Q_{ij} = I_i \times I_j$ of H let $D_{ij}$ be the diagonal through the
origin and let $H_{ij}$ be the hexagonal subset of $Q_{ij}$ consisting of all points
no further from $D_{ij}$ than $2^{-1/2} a$. Let $S_{ij}$ be the Cartesian product of $H_{ij}$
with all the $I_k$'s for which $k \ne i$ and $k \ne j$.
$S_{ij}$ is now the sample space of the configurations which are forbidden
on account of too close approach of the points $x_i$ and $x_j$ for the chosen
indices i and j: $|x_i - x_j| \le a$. The sample space Y of the allowed configurations
is therefore the set

$$H - \bigcup_{1 \le i < j \le n} S_{ij} .$$

When the $\binom{n}{2}$ paradiagonal slabs $S_{ij}$, based on the paradiagonal sets
$H_{ij}$, are removed from H, the remainder of the cube H consists of $n!$
congruent simplexes which can be reassembled by suitable translations
so as to form a smaller cube $H_1$ of side-length $L - (n - 1)a$. By the
randomness assumption $J(n, a, L)$ is the ratio of the volumes of the
cubes $H_1$ and H, and so (2) is proved again.

The above procedure works on account of a lucky geometrical accident
of the fitting of $n!$ simplexes. If A and B were some other, m-dimensional,
sets, we could still form the paradiagonal sets and slabs and we could
attempt to find the volume of the union $\bigcup S_{ij}$ of all the paradiagonal
slabs. This is essentially what is done in the next section by means of the 
inclusion-exclusion principle’. 


IV. SIMPLE COINCIDENCE WITH SEPARABLE INTEGRAND 


In this section we are concerned with a configuration condition corresponding
to simple coincidence: $u = (x_1, \ldots, x_n) \in Y$ if and only if
for some i and j $B(x_i)$ and $B(x_j)$ intersect. Subject to general restrictions,
A, B, m, and n are arbitrary. We let $N = \binom{n}{2}$ and we form the N
paradiagonal sets

$$H_{ij} = \{(x_i, x_j) : B(x_i) \cap B(x_j) \ne \emptyset\}$$

and the N paradiagonal slabs

$$S_{ij} = \{(x_1, \ldots, x_n) : B(x_i) \cap B(x_j) \ne \emptyset\} .$$

Let the slabs be enumerated by a single index as $\{S_k\}$, $k = 1, \ldots, N$.
Then an application of the inclusion-exclusion principle gives

$$J = \int_Y F\, dV = \sum_{r=1}^{N} (-1)^{r-1} \sum_{1 \le k_1 < k_2 < \cdots < k_r \le N} \int_{S_{k_1} \cap \cdots \cap S_{k_r}} F\, dV = \sum_{r=1}^{N} (-1)^{r-1} K_r . \qquad (4)$$

With the general integrand no further elaboration of (4) is possible.
Suppose now that F has the separable form

$$F = \prod_{i=1}^{n} f(x_i) . \qquad (5)$$

With the double-index enumeration of the $S_{ij}$'s the first term $K_1$ can
be written as

$$K_1 = \sum_{1 \le i < j \le n} \int_{S_{ij}} \prod_{s=1}^{n} f(x_s)\, dV$$

and since all the N paradiagonal sets are congruent, we have

$$K_1 = N \Big( \int_A f(x)\, dx \Big)^{n-2} \int_{H_{12}} f(x_1) f(x_2)\, dx_1\, dx_2 .$$

For reasons which will be clear shortly we write

$$N_{11} = N, \qquad \int_A f(x)\, dx = J_0 , \qquad \int_{H_{12}} f(x_1) f(x_2)\, dx_1\, dx_2 = J_{11} \qquad (6a)$$

so that

$$K_1 = N_{11} J_0^{n-2} J_{11} . \qquad (6b)$$
Similarly, the second term $K_2$ in (4) is

$$K_2 = \sum_{(i_1, j_1), (i_2, j_2)} \int_{S_{i_1 j_1} \cap S_{i_2 j_2}} \prod_{s=1}^{n} f(x_s)\, dV$$

where the summation extends over all distinct pairs $(i_1, j_1)$, $(i_2, j_2)$
such that $1 \le i_1 < j_1 \le n$, $1 \le i_2 < j_2 \le n$; no regard is paid to the
order of pairs [(1, 2), (3, 4) is the same as (3, 4), (1, 2)], so that there
are exactly

$$\binom{N}{2}$$

such pairs of pairs. There are two types of these: $N_{21}$ pairs like (1, 2),
(3, 4) with all four indices different, and $N_{22}$ pairs like (1, 2), (1, 3) with
one shared index. By a simple calculation

$$N_{21} = n(n - 1)(n - 2)(n - 3)/8, \qquad N_{22} = n(n - 1)(n - 2)/2, \qquad N_{21} + N_{22} = \binom{N}{2} \qquad (7a)$$
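
The counts in (7a) can be confirmed by brute-force enumeration for small n. The sketch below, with the hypothetical helper name `count_pair_types`, lists all unordered pairs of index pairs and sorts them by whether they share an index.

```python
from itertools import combinations
from math import comb


def count_pair_types(n):
    """Return (N21, N22): pairs of index pairs with no shared index, and with one shared index."""
    index_pairs = list(combinations(range(1, n + 1), 2))
    disjoint = shared = 0
    for first, second in combinations(index_pairs, 2):
        if set(first) & set(second):
            shared += 1       # like (1, 2), (1, 3)
        else:
            disjoint += 1     # like (1, 2), (3, 4)
    return disjoint, shared


# Compare the enumeration with the closed forms of (7a) for several n.
checks = {
    n: count_pair_types(n) == (n * (n - 1) * (n - 2) * (n - 3) // 8,
                               n * (n - 1) * (n - 2) // 2)
    for n in range(4, 9)
}
totals_match = all(sum(count_pair_types(n)) == comb(comb(n, 2), 2)
                   for n in range(4, 9))
```

For n = 4, for example, the enumeration gives 3 disjoint pairs of pairs and 12 with a shared index, which sum to $\binom{6}{2} = 15$.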
and in analogy to (6) we set

$$J_{21} = \int_{H_{12} \cap H_{34}} \prod_{i=1}^{4} f(x_i)\, dx_1\, dx_2\, dx_3\, dx_4 = \Big[ \int_{H_{12}} f(x_1) f(x_2)\, dx_1\, dx_2 \Big]^2 = J_{11}^2 , \qquad (7b)$$

$$J_{22} = \int_{H_{12} \cap H_{13}} \prod_{i=1}^{3} f(x_i)\, dx_1\, dx_2\, dx_3$$

so that

$$K_2 = N_{21} J_0^{n-4} J_{21} + N_{22} J_0^{n-3} J_{22} . \qquad (7c)$$
The main purpose of this section is to develop formulae, analogous 
to (6) and (7), for the general term K, of (4). The principal dif- 
ficulty here is that in passing from the single-index formula for K, 


n 


K,=)>> 


LSkice+eck+sN i is 


f(a.) dV (8a) 
1 
to the double-index formula 


K.= > ee, i i f(a) dV (8b) 


(i1,71) biiNeeOSiyip 
we need an adequate description of the different types of 1-tuples 
of pairs of indices occurring in (4), together with a hold on the range 
of summation in (8b). For instance, with r = 2 there are two such 
types, illustrated by (1, 2), (8, 4) and (1, 2), (1, 3). With r = 3 there 
are five types of index-sharing in triples of pairs: 


(1, 2), (3, 4), (5, 6); (1, 2), (1, 8), (4, 5); CL, 2), (2, 3), (3, 4); 


(1, 2), C1, 3), (1, 4); G1, 2), (1,8) (2, 3); (9) 

We may therefore expect that the formula for r = 3, analogous 
to (7c) for r = 2, will have five terms rather than two. The number 
of such types grows very rapidly with r, and as an aid we introduce 
certain graphs associated with the terms of (8). These graphs reflect 
completely the intersection properties of the sets $B(x_1), \ldots, B(x_n)$.
For r = 3 there are five such graphs corresponding to the five types
enumerated in (9). These are given in Fig. 1 together with the
corresponding B-configurations. (It is, of course, assumed that n > 2.)

Each graph is of the following kind: 


(i) No vertex is isolated. 
(ii) No pair of vertices is connected by more than one edge. 
(iii) No edge connects a vertex to itself.
(iv) There are exactly r edges.
(v) There are exactly v vertices.


Fig. 1 — Coincidence graphs, r = 3.


One further, and crucial, condition is added: 


(vi) If the v vertices are enumerated in some order then there exists
a configuration of v translates $B_1, \ldots, B_v$ of B, such that $B_i$
and $B_j$ intersect if and only if the i-th and the j-th vertices are
connected by an edge.


For the sake of convenience we make here the following conven- 
tion: two convex m-dimensional bodies will be said to intersect only 
if their intersection is itself m-dimensional, otherwise they are to be 
regarded as disjoint. The reason for this is that we are interested in 
purely metric properties: the intersections of such sets serve as
domains of integration for well-behaved functions in $E^m$.

A graph satisfying conditions i through vi will be called a (B, r, v)-graph,
one satisfying i through iv and vi a (B, r)-graph, and one
satisfying i through iii and vi a B-graph. It must be emphasized that
the condition vi is not of the usual graph-theoretic kind and it
prevents many graphs from being B-graphs. For instance, let m = 2
and let B be a circular disk. Since a disk in $E^2$ cannot intersect six
congruent pairwise disjoint disks, the graphs of Fig. 2 are not
B-graphs.

The proof of the above assertion for the graph of Fig. 2b is obtained 
by showing that here the “extreme” configuration is that of Fig. 3. 

Similarly, when m = 2 and B is a square then B cannot intersect
five pairwise disjoint translates of itself (for each translate contains
a vertex of B) so that the graph of Fig. 4a is not a B-graph. On the
other hand, the graph of Fig. 4b, which corresponds to that of Fig.
2b, is a B-graph as shown by the configuration of Fig. 4c.


Fig. 2 — Graphs which are not B-graphs. (B is a disk.)

Returning to the evaluation of $K_r$, we start with (8b). Summation
there extends over all the

$$\binom{\binom{n}{2}}{r}$$

distinct r-tuples of pairs of indices where for each pair $(i_s, j_s)$,
$1 \le i_s < j_s \le n$; r-tuples differing only in the order of pairs are
not considered distinct. We can now associate the terms of (8b) in a
1 : 1 fashion with the distinct (B, r)-graphs on some n vertices
$w_1, \ldots, w_n$. Given a B-graph G let

$$S(G) = \bigcap S_{ij},$$ (10)

where the intersection is taken over all pairs (i, j) for which $w_i$ is
connected to $w_j$ by an edge in G. Then (8b) may be written as

$$K_r = \sum_{(G)} \int_{S(G)} \prod_{i=1}^{n} f(x_i)\, dx_i,$$ (11)

the summation running over all distinct (B, r)-graphs on n vertices.


Fig. 3 — An extreme B-configuration.

Fig. 4 — Configurations when B is a square.

Let v(G) denote the number of vertices of G and C(G) a connected
component of G. Since the integrand in (11) is completely separable,
the integral over S(G) splits into a product of integrals over the
connected components and we get

$$K_r = \sum_{(G)} J_0^{\,n-v(G)} \prod_{C(G)} J[C(G)].$$ (12)

Here J[C(G)] is an integral over the connected component and the
product is taken over all such components of G. Two examples of
integrals J[C(G)] are given in (7b). Owing to the congruence of all
the paradiagonal slabs and the form of the integrand, it is not
necessary to sum in (12) over all (B, r)-graphs on the vertices
$w_1, \ldots, w_n$, but only over their types.

Suppose that there are exactly t = t(r) types of such graphs and
let $G_j$ be any one of the j-th type; let also $N_{rj}(n)$ be the number of
different (B, r)-graphs on the vertices $w_1, \ldots, w_n$ of the j-th type.
Then (12) becomes

$$K_r = \sum_{j=1}^{t(r)} N_{rj}(n)\, J_0^{\,n-v(G_j)} \prod_{C(G_j)} J[C(G_j)].$$ (13)

Thus the problem of evaluating I has been reduced through (4)
and (13) to three problems: the geometrical problem of determining the
types of (B, r)-graphs, the combinatorial problem of calculating the
coefficients $N_{rj}(n)$, and the analytical problem of evaluating the
cluster-integrals over the connected (B, r)-graphs.


V. MULTIPLE COINCIDENCE WITH SEPARABLE INTEGRAND 


Formulae analogous to those of the previous section will now be
obtained for the case of p-tuple coincidence. Subject to general conditions,
A, B, n, and m are arbitrary and F is of the separable form (5). An
integer p is fixed ($2 \le p \le n$) and the configuration condition is:
$u = (x_1, \ldots, x_n) \in Y$ if and only if there are p indices
$i_1, \ldots, i_p$ ($1 \le i_1 < \cdots < i_p \le n$) such that

$$\bigcap_{s=1}^{p} B(x_{i_s}) \ne \emptyset.$$

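We observe here our convention that the intersection must be itself
m-dimensional. We introduce the analogs of paradiagonal sets
and slabs:

$$H_{i_1 \cdots i_p} = \Bigl\{(x_{i_1}, \ldots, x_{i_p}) : \bigcap_{s=1}^{p} B(x_{i_s}) \ne \emptyset\Bigr\},$$

$$S_{i_1 \cdots i_p} = \Bigl\{(x_1, \ldots, x_n) : \bigcap_{s=1}^{p} B(x_{i_s}) \ne \emptyset\Bigr\},$$

we let $M = \binom{n}{p}$ and we re-enumerate the sets $S_{i_1 \cdots i_p}$ with a single
index k as $\{S_k\}$, $1 \le k \le M$. Then we get a formula analogous to (4):

$$I_p = \int_Y F\, dV = \sum_{r=1}^{M} (-1)^{r-1} \sum_{1 \le k_1 < \cdots < k_r \le M} \int_{S_{k_1} \cap \cdots \cap S_{k_r}} F\, dV = \sum_{r=1}^{M} (-1)^{r-1} U_r.$$ (14)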


As in (6a) we let

$$M_{11} = M, \qquad \int_A f(x)\, dx = J_0, \qquad \int_{H_{1 \cdots p}} \prod_{i=1}^{p} f(x_i)\, dx_1 \cdots dx_p = J_{11},$$

to get

$$U_1 = M_{11} J_0^{\,n-p} J_{11}.$$
In terms of p-tuple indices the second term $U_2$ of (14) is

$$U_2 = \sum_{(i_1,\ldots,i_p),(j_1,\ldots,j_p)} \int_{S_{i_1 \cdots i_p} \cap S_{j_1 \cdots j_p}} \prod_{i=1}^{n} f(x_i)\, dV.$$

The summation extends over the $\binom{M}{2}$ distinct pairs of p-tuples. We have
now p types of such pairs, depending on the number of shared indices,
which may be 0, 1, ..., or p − 1. Let $M_{2j}$ be the number of p-tuple
pairs of type j (that is, with j − 1 indices shared) and put

$$J_{2j} = \int \prod_{i=1}^{2p-j+1} f(x_i)\, dx_1 \cdots dx_{2p-j+1},$$

the integral being taken over the set where both p-tuples of translates
intersect; then

$$U_2 = \sum_{j=1}^{p} M_{2j} J_0^{\,n-2p+j-1} J_{2j}.$$

Observe that the integral $J_{21}$ splits into a product: $J_{21} = J_{11}^2$. To
get an expression for arbitrary $U_r$ we introduce a higher-dimensional
equivalent of B-graphs. Let X be a regular simplex in $E^{n-1}$ on the vertices
$w_1, \ldots, w_n$. On account of properties i through iii listed in Section IV,
a (B, r)-graph is simply a set of certain r edges (or one-dimensional
faces) of X. A d-dimensional hypergraph G will be just a set of some of
the $\binom{n}{d+1}$ d-dimensional faces of X. This takes care of properties
i through iii. When there are r such faces in G we shall speak of an
(r)-hypergraph and when these faces comprise between them v vertices
of X, G will be called an (r, v)-hypergraph.

An equivalent of the important condition (vi) is very naturally
obtained: there is a B-configuration of v translates $B_1, \ldots, B_v$ of B,
such that any d + 1 of them, say, $B_{i_1}, \ldots, B_{i_{d+1}}$, intersect if and only if
$w_{i_1}, \ldots, w_{i_{d+1}}$ are the vertices of a d-dimensional face of X included
in G. Components, types, and so on, for (B, r)-hypergraphs are defined
in the same way as before. For instance, a hypergraph G is connected
if no plane disjoint from it can strictly separate some of its d-faces from
others. All quantities such as C(G) and v(G) have the same meaning
as before. Let t = t(r, d) be the number of different types of (B, r)-hypergraphs,
let $G_j$ be any one hypergraph of the j-th type, and let
$M_{rj}(n)$ be the number of different (B, r)-hypergraphs of the j-th type on
the n vertices. Then, proceeding as before, we get the equivalent of (13):

$$U_r = \sum_{j=1}^{t(r,\,p-1)} M_{rj}(n)\, J_0^{\,n-v(G_j)} \prod_{C(G_j)} J[C(G_j)].$$ (15)


VI. SOME COMBINATORIAL PROPERTIES OF B-GRAPHS AND B-HYPERGRAPHS 


Let $\varphi(r)$ and $\psi(r)$ be the smallest and the largest number of
vertices, respectively, in a (B, r, v)-graph G. From conditions i through
iii we have at once $\psi(r) = 2r$. G is then minimally connected with r
components (Fig. 5a). Suppose that r is a triangular number:
r = s(s−1)/2; there is then a complete graph on s vertices which is
clearly a B-graph for any B, so that s = v. If r is not a triangular
number let t(t−1) < 2r < t(t+1) and put e = r − t(t−1)/2.

Let G be the complete graph on t vertices. For the corresponding
B-configuration we may assume that the translates $B_1, \ldots, B_t$ of B
have an interior point in common. We check that e < t and that
$B_1, \ldots, B_t$ may be arranged so that a point $z \in \bigcap_{i=1}^{e} B_i$ can be
strictly separated from $\bigcup_{i=e+1}^{t} B_i$ by a plane P. Let $B_{t+1}$ be a
translate of B which contains z and lies strictly on the same side of P as
z. Then the resulting B-configuration $B_1, \ldots, B_{t+1}$ corresponds to a
(B, r, v)-graph G with v = t + 1. This G may be said to be a maximally
connected (B, r, v)-graph. We have now

$$\psi(r) = 2r, \qquad \varphi(r) = \min\{j : j \ge [1 + (1 + 8r)^{1/2}]/2\}.$$


Fig. 5 — (B, r, v)-graphs with high v.
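The two bounds are easy to tabulate. The sketch below is an illustration only, with hypothetical function names; it evaluates the closed form for the smallest vertex count in exact integer arithmetic and compares it with the purely combinatorial definition (the least v for which v(v−1)/2 edges are available). It does not test the geometric condition (vi).

```python
from math import comb, isqrt


def phi_closed_form(r):
    """Smallest integer j with j >= [1 + (1 + 8r)^(1/2)] / 2, computed exactly."""
    j = (1 + isqrt(1 + 8 * r)) // 2
    while j * (j - 1) // 2 < r:   # round up when r is not a triangular number
        j += 1
    return j


def phi_by_search(r):
    """Least number of vertices v on which r distinct edges can be drawn."""
    v = 2
    while comb(v, 2) < r:
        v += 1
    return v


def psi(r):
    """Largest number of vertices: r disjoint edges."""
    return 2 * r


if __name__ == "__main__":
    for r in range(1, 11):
        print(r, phi_closed_form(r), psi(r))
```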
Similarly, let $\varphi(r, d)$ and $\psi(r, d)$ be the corresponding minimum and
maximum of v for a (B, r, v)-hypergraph. Then clearly $\psi(r, d) = (d + 1)r$.

To determine $\varphi(r, d)$ we suppose first that $r = \binom{s}{d+1}$. There is then
a complete hypergraph on s vertices, consisting of all the d-dimensional
faces of an (s − 1)-dimensional simplex. This hypergraph is a B-hypergraph
for any B and so v = s. If

$$\binom{t}{d+1} < r < \binom{t+1}{d+1}$$

we proceed as before and find that v = t + 1. Hence

$$\psi(r, d) = (d + 1)r, \qquad \varphi(r, d) = \min\{j : j \ge \text{largest positive root of } x(x-1)\cdots(x-d) = (d+1)!\, r\}.$$
The bounds $\varphi(r)$ and $\psi(r)$ lead us to the possibility of a
combinatorial identity

$$\binom{\binom{n}{2}}{r} = \sum_{k=\varphi(r)}^{\psi(r)} A_{rk} \binom{n}{k}$$ (16)

and its relation to B-graphs and the numbers $N_{rj}(n)$. For instance,
we find for r = 2

$$\binom{\binom{n}{2}}{2} = 3\binom{n}{3} + 3\binom{n}{4};$$

$\binom{\binom{n}{2}}{2}$ is the total number $N_{21} + N_{22}$ of graphs of (7a) and

$$3\binom{n}{4} = N_{21}, \qquad 3\binom{n}{3} = N_{22}.$$


To prove the validity of an expansion like (16) for all r observe
that the left-hand side is a polynomial in n of degree $2r = \psi(r)$ so
that

$$\binom{\binom{n}{2}}{r} = \sum_{k=0}^{2r} A_{rk} \binom{n}{k}.$$

Further, $A_{rk} = 0$ for $k < \varphi(r)$, for we substitute successively n = 0,
1, ..., $\varphi(r) - 1$ in (16) and recall that $\binom{p}{r} = 0$ for p < r. By expanding
both sides of (16) in powers of n and comparing the coefficients we find

$$A_{r,2r} = \frac{(2r)!}{2^r r!}, \qquad A_{r,2r-1} = \frac{(2r-1)!}{2^{r-1}(r-2)!}, \qquad A_{r,2r-2} = \frac{(2r-2)!\,(3r-1)}{3 \cdot 2^{r-1}(r-3)!}$$

and so on. Therefore (16) may be written as

$$\binom{\binom{n}{2}}{r} = \frac{(n)_{2r}}{2^r r!} + \frac{(n)_{2r-1}}{2^{r-1}(r-2)!} + \frac{(n)_{2r-2}}{3 \cdot 2^{r-1}(r-3)!/(3r-1)} + \cdots = \sum_{i=0}^{2r-\varphi(r)} \frac{(n)_{2r-i}}{D_i}.$$ (17)

Here $(n)_p$ stands for n(n−1) ... (n−p+1).

The denominators $D_i$ have the following interpretation. Consider
first the (B, r, 2r)-graph of Fig. 5a. The 2r vertices can be chosen out
of $w_1, \ldots, w_n$ in $(n)_{2r}$ ways. We define the symmetry number for
a (B, r, v)-graph to be the number of ways in which its vertices can
be labelled with integers 1, 2, ..., v, all of which ways are to
correspond to the same B-configuration. Here the symmetry number is $2^r r!$,
as there are $2^r$ ways of permuting the labels on the two vertices of
each component and r! ways of permuting the components. This leads
us to the first term $(n)_{2r}/2^r r!$ in (17) which is precisely the number
$N_{r1}(n)$ of (13) provided that we consider $G_1$ in (13) to be of the type
of Fig. 5a.

Similarly, for the (B, r, 2r−1)-graph of Fig. 5b we find the
symmetry number to be $2^{r-1}(r-2)!$. The number of ways to choose the
2r−1 vertices is $(n)_{2r-1}$ and so we get the second term
$(n)_{2r-1}/2^{r-1}(r-2)!$ of (17). The situation gets somewhat more
complicated for the (B, r, 2r−2)-graphs. Here we have three types instead
of one, illustrated in Fig. 5c. The 2r−2 vertices can be selected in
$(n)_{2r-2}$ ways, the symmetry numbers for the three types are

$$2^{r-2}(r-3)!, \qquad 3 \cdot 2^{r-2}(r-3)!, \qquad \text{and} \qquad 2^{r-1}(r-4)!.$$ (18)

Therefore, the corresponding numbers of graphs, say $N_{r3}(n)$, $N_{r4}(n)$,
$N_{r5}(n)$, are

$$(n)_{2r-2}/[2^{r-2}(r-3)!], \qquad (n)_{2r-2}/[3 \cdot 2^{r-2}(r-3)!], \qquad (n)_{2r-2}/[2^{r-1}(r-4)!]$$

and their sum is precisely the third term of (17). The corresponding
denominator $D_2$ is therefore one third of the harmonic mean of the
three symmetry numbers in (18).

Thus the first few terms of (17) give the total numbers

$$\sum_j N_{rj}(n)$$

of (B, r, v)-graphs for v = 2r, 2r−1, and so on. However, this
pleasing circumstance breaks down as soon as we reach the smallest term
i for which one of the types of graphs in question is not a B-graph.
For the case m = 2, B a circular disk, this occurs for i = 7 and the
graph in question is then that of Fig. 2a together with other components
containing one edge each. When B is a square the graph of Fig. 4a shows
that the breakdown occurs for i = 6. On the other hand, the quantity
$(n)_{2r-i}/D_i$ from (17) always provides an upper bound for the sum
$\sum_j N_{rj}(n)$, the summation extending over all types j of (B, r, 2r − i)-graphs.
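The explicit form of (16) is

$$\binom{\binom{n}{2}}{r} = \sum_{k=q}^{2r} A_{rk} \binom{n}{k}$$ (19)

where $q = \varphi(r)$ and

$$A_{rk} = \sum_{j=0}^{k-q} (-1)^j \binom{k}{j} \binom{\binom{k-j}{2}}{r}.$$ (20)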
We prove (20) by induction on k. For k = q, (20) holds; suppose it to be
proved for $k \le q + s - 1$. Let n = q + s in (19), then

$$A_{r,q+s} = \binom{\binom{q+s}{2}}{r} - \sum_{i=0}^{s-1} \binom{q+s}{q+i} A_{r,q+i}$$

which by the induction hypothesis may be written as

$$A_{r,q+s} = \binom{\binom{q+s}{2}}{r} - \sum_{i=0}^{s-1} \sum_{j=0}^{i} (-1)^j \binom{q+s}{q+i} \binom{q+i}{j} \binom{\binom{q+i-j}{2}}{r}.$$

In the double sum we may sum first over those terms for which the
difference u = i − j is constant, then over u. In this way one gets

$$A_{r,q+s} = \binom{\binom{q+s}{2}}{r} - \sum_{u=0}^{s-1} \binom{\binom{q+u}{2}}{r} \sum_{j=0}^{s-1-u} (-1)^j \binom{q+s}{q+u+j} \binom{q+u+j}{j}$$

which after some simple algebra becomes (20) with k = q + s. This
completes the induction and the proof of (20).

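Some combinatorial identities may be obtained from the above.
For example, we know that $A_{r,2r} = (2r)!/2^r r!$. Hence, on putting k = 2r
in (20), we get

$$\sum_{j=0}^{2r-q} (-1)^j \binom{2r}{j} \binom{\binom{2r-j}{2}}{r} = (2r)!/2^r r!.$$ (21)

Similarly, with k = 2r − 1 and k = 2r − 2 we get

$$\sum_{j=0}^{2r-1-q} (-1)^j \binom{2r-1}{j} \binom{\binom{2r-1-j}{2}}{r} = (2r-1)!/2^{r-1}(r-2)!,$$ (22)

$$\sum_{j=0}^{2r-2-q} (-1)^j \binom{2r-2}{j} \binom{\binom{2r-2-j}{2}}{r} = [(2r-2)!\,(r - 1/3)]/2^{r-1}(r-3)!.$$ (23)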
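The coefficients in (20) and the identities (21) through (23) lend themselves to a numerical check. The sketch below is illustrative only (function names are hypothetical): it evaluates (20) directly, compares small cases against a brute-force count of graphs with r edges and no isolated vertex on k labelled vertices, and confirms the expansion (19). The brute-force count ignores the geometric condition (vi), so it checks the combinatorial identity only.

```python
from itertools import combinations
from math import comb, factorial


def phi(r):
    """Least number of vertices v with v(v-1)/2 >= r."""
    v = 2
    while comb(v, 2) < r:
        v += 1
    return v


def a_coeff(r, k):
    """A_rk evaluated from the alternating sum (20)."""
    q = phi(r)
    return sum((-1) ** j * comb(k, j) * comb(comb(k - j, 2), r)
               for j in range(k - q + 1))


def count_graphs(r, k):
    """Brute force: graphs with r edges on k labelled vertices, none isolated."""
    edges = list(combinations(range(k), 2))
    return sum(1 for g in combinations(edges, r)
               if len({v for e in g for v in e}) == k)


if __name__ == "__main__":
    for r in range(1, 5):
        row = [a_coeff(r, k) for k in range(phi(r), 2 * r + 1)]
        print("r =", r, "A_rk for k = phi(r)..2r:", row)
    # Top coefficient against the closed form (2r)!/(2^r r!)
    print(a_coeff(4, 8), factorial(8) // (2 ** 4 * factorial(4)))
```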
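For hypergraphs we have the identity

$$\binom{\binom{n}{d}}{r} = \sum_{k=q}^{dr} A_{rk}(d) \binom{n}{k}$$ (24)

where d denotes here the number of vertices of a face (so that d = 2
gives (19)) and q is the corresponding smallest number of vertices,
$\varphi(r, d-1)$ in the notation above. The explicit expression for the
coefficients $A_{rk}(d)$ can be found in the same way as (20):

$$A_{rk}(d) = \sum_{j=0}^{k-q} (-1)^j \binom{k}{j} \binom{\binom{k-j}{d}}{r}.$$ (25)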


Some of the higher coefficients $A_{r,dr}$, $A_{r,dr-1}$, ... can be evaluated by
comparing the powers of n in (24):

$$A_{r,dr}(d) = (dr)!/r!\,(d!)^r, \qquad A_{r,dr-1}(d) = (dr)!\, d(r-1)/2 \cdot r!\,(d!)^r$$

and so on, so that by putting k = rd and k = rd − 1 in (25) we get

$$\sum_{j=0}^{rd-q} (-1)^j \binom{rd}{j} \binom{\binom{rd-j}{d}}{r} = (dr)!/r!\,(d!)^r$$ (26)

and

$$\sum_{j=0}^{rd-1-q} (-1)^j \binom{rd-1}{j} \binom{\binom{rd-1-j}{d}}{r} = (dr)!\, d(r-1)/2 \cdot r!\,(d!)^r.$$ (27)

The coefficients $A_{rk}(d)$ have the same interpretation with hypergraphs
as the $A_{rk}$ have with ordinary graphs, and they refer to
symmetry numbers.


VII. SIMPLE COINCIDENCE IN A CUBE 


We consider here the problem of evaluating the probability P(n,a,L) 
that when n points are taken at random (uniform distribution) in a 
three-dimensional cube of edge-length L, then no two points are closer 
than a. The problem occurs in deriving the van der Waals equation 
from a primitive hard-sphere gas model. See, for instance, Ref. 2, 
where the problem is termed “very difficult” and the crude (though 
sufficient) approximation 


$$P(n, a, L) \approx \prod_{j=1}^{n-1} (1 - 4\pi j a^3/3L^3) \approx 1 - 2\pi n^2 (a/L)^3/3$$ (28)


is used. 
From our formulation we find that

$$L^{3n}[1 - P(n, a, L)]$$

is the I integral for the case m = 3, A is a cube of volume $L^3$, B a ball
of radius a/2, and the configuration condition is that not all sets $B(x_i)$
be disjoint; in other words, a simple coincidence. Therefore by (4),
(13), and an inspection of Fig. 1 we have

$$L^{3n}[1 - P(n, a, L)] = N_{11} L^{3n-6} I_{11} - (N_{21} L^{3n-12} I_{21} + N_{22} L^{3n-9} I_{22}) + \sum_{j=1}^{5} N_{3j} L^{3[n - v(G_j)]} I_{3j} - \cdots$$ (29)

($G_j$ being the j-th graph of Fig. 1), where the integrals $I_{11}$, $I_{21}$, ...
can be symbolically represented by their graphs; in particular
$I_{21} = I_{11}^2$.

[Display (30): graph symbols for the integrals $I_{11}$, $I_{21}$, $I_{22}$, $I_{31}$, ..., $I_{35}$.]

To obtain an explicit Cartesian expression for an integral, I, we
consider its signature graph G which is a (B, r, v)-graph. If the v
vertices are enumerated as 1, 2, ..., v in an arbitrary order then I
becomes a 3v-tuple integral

$$I = \int \cdots \int_{R_I} d\mathbf{r}_1 \cdots d\mathbf{r}_v,$$ (31a)

where $\mathbf{r}_i$ is the vector $(x_i, y_i, z_i)$, $d\mathbf{r}_i$ stands for $dx_i\, dy_i\, dz_i$, and the
region of integration $R_I$ is given by 3v + r inequalities:

$$0 \le x_i \le L, \quad 0 \le y_i \le L, \quad 0 \le z_i \le L \qquad (i = 1, \ldots, v),$$ (31b)

$$|\mathbf{r}_i - \mathbf{r}_j|^2 \le a^2 \text{ if the i-th and the j-th vertices are connected in G by an edge.}$$ (31c)

Further, such an integral occurs in (29) with the multiplier $N_{rj} L^{3(n-v)}$
where $N_{rj}$ is the number of distinct graphs on n vertices, which are of
the same type as G. Together with each such integral $I = I_{rj}$ we may
also consider the corresponding integral $K_{rj}$ given by

$$K_{rj} = \int \cdots \int_{Q_K} d\mathbf{r}_1 \cdots d\mathbf{r}_v,$$

where the region $Q_K$ is given by the $(v^2 + 5v)/2$ inequalities (31b), (31c)
and

$$|\mathbf{r}_i - \mathbf{r}_j|^2 \ge a^2 \text{ if the i-th and the j-th vertices are not connected in G by an edge.}$$ (31d)

It turns out that the I integrals are expressible in terms of the K
integrals, and conversely. For instance, consider the K integral with
the signature graph which has four vertices 1, 2, 3, and 4, and edges
12, 13, 23, and 24. We write it in a self-explanatory terminology as


(12)(13)(23)(24)[1 − (14)][1 − (34)]

and multiply this out to get


(12)(13)(23)(24) − (12)(13)(23)(24)(14) − (12)(13)(23)(24)(34)
+ (12)(13)(23)(24)(14)(34)


which yields at once a representation of K as a sum of four I-integrals.
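The expansion of a K integral into signed I-integrals is mechanical: each non-edge of the signature graph contributes a factor [1 − (ij)], and multiplying out gives one term per subset of non-edges. A minimal sketch of this bookkeeping follows; the function name `expand_k` and the output format are assumptions made for illustration.

```python
from itertools import combinations


def expand_k(v, edges):
    """Expand the K integrand for a graph on vertices 1..v.

    Returns a list of (sign, edge_list) terms; each edge_list is the
    signature graph of one I-integral in the expansion.
    """
    edge_set = {tuple(sorted(e)) for e in edges}
    non_edges = [p for p in combinations(range(1, v + 1), 2) if p not in edge_set]
    terms = []
    for m in range(len(non_edges) + 1):
        for extra in combinations(non_edges, m):
            # Choosing m non-edge factors (ij) contributes the sign (-1)^m.
            terms.append(((-1) ** m, sorted(edge_set | set(extra))))
    return terms


if __name__ == "__main__":
    for sign, graph in expand_k(4, [(1, 2), (1, 3), (2, 3), (2, 4)]):
        print("+" if sign > 0 else "-", graph)
```

For the four-vertex example above the non-edges are 14 and 34, so the sketch produces four terms, matching the expansion in the text.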
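The first integral $I_{11}$ is sixtuple and can be reduced to an iterated
integral as follows:

$$I_{11} = \int_{m_6}^{M_6} \int_{m_5}^{M_5} \int_{m_4}^{M_4} \int_{m_3}^{M_3} \int_{m_2}^{M_2} \int_{m_1}^{M_1} dx_1\, dx_2\, dy_1\, dy_2\, dz_1\, dz_2$$ (32)

where

$$m_2 = m_4 = m_6 = 0, \qquad M_2 = M_4 = M_6 = L$$

and

$$m_1 = \max\{0,\; x_2 - [a^2 - (y_1 - y_2)^2 - (z_1 - z_2)^2]^{1/2}\},$$

$$M_1 = \min\{L,\; x_2 + [a^2 - (y_1 - y_2)^2 - (z_1 - z_2)^2]^{1/2}\},$$

$$m_3 = \max\{0,\; y_2 - [a^2 - (z_1 - z_2)^2]^{1/2}\},$$

$$M_3 = \min\{L,\; y_2 + [a^2 - (z_1 - z_2)^2]^{1/2}\},$$

$$m_5 = \max\{0,\; z_2 - a\},$$

$$M_5 = \min\{L,\; z_2 + a\}.$$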

This arrangement of the limits of integration corresponds to taking
two balls of radii a/2 and centers $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$, and
letting the center of the first ball move freely over the cube while the
coordinates of the second center vary so that the balls intersect.
Accordingly, $I_{11}$ has a simple probabilistic interpretation:
$I_{11} = L^6[1 - P(2, a, L)]$, where P(2, a, L) is the probability that two
points taken at random in the cube of edge-length L are no nearer than a.
A similar probabilistic interpretation holds for any other K integral. If G
is its (B, r, v)-graph then K is $L^{3v}$ times the probability that when v balls
of radius a/2 are taken with their centers at random in the cube, then
the balls are in the configuration of G (so that two of them intersect
if and only if the corresponding vertices of G are connected by an
edge).

We evaluate now the integral (32) subject to the condition $a \le L$.
Integration with respect to $x_1$ and $x_2$ gives

$$L^2 - [\max(0, L - D)]^2$$ (33)

where

$$D^2 = a^2 - (y_1 - y_2)^2 - (z_1 - z_2)^2.$$

Since $a \le L$ we have $D \le L$ and therefore (33) is $2LD - D^2$. Integrating
this with respect to $y_1$ and $y_2$ we get first, on putting $y_1 - y_2 = u$,

$$\int_0^L \int_{-m}^{M} [2L(b^2 - u^2)^{1/2} - (b^2 - u^2)]\, du\, dy_2,$$

where

$$b^2 = a^2 - (z_1 - z_2)^2, \qquad m = \min(y_2, b), \qquad M = \min(L - y_2, b).$$

Again, $a \le L$ implies $b \le L$ and the double integral is therefore

$$\pi L^2 b^2 - 8Lb^3/3 + b^4/2.$$

Finally, integrating with respect to $z_1$ and $z_2$ we get

$$I_{11} = 4\pi a^3 L^3/3 - 3\pi a^4 L^2/2 + 8a^5 L/5 - a^6/6, \qquad 0 \le a \le L.$$ (34)


There are two more forms of $I_{11}$, corresponding to the ranges
$L \le a \le 2^{1/2} L$ and $2^{1/2} L \le a \le 3^{1/2} L$, but they do not appear to be
expressible in terms of elementary or standard transcendental functions.
It may be observed that the leading term in (34) is the product of the
volumes of the cube and the ball of radius a.

To get a better approximation to P(n, a, L) than (28), we examine
(29) and find that for small a every integral $I_{rj}$, beyond $I_{11}$, is $O(a^6)$.
Therefore

$$1 - P(n, a, L) = \binom{n}{2} \Bigl[\frac{4\pi}{3}(a/L)^3 - \frac{3\pi}{2}(a/L)^4 + \frac{8}{5}(a/L)^5\Bigr] + O[(a/L)^6].$$ (35)

It is possible to find the exact limit of P(n, a, L) as

$$n \to \infty, \qquad a \to 0, \qquad (4\pi/3)(n^2/2)(a^3/L^3) \to b.$$

For we have then $P(n, a, L) \to P(b)$ and

$$1 - P(b) = N_{11} I_{11}/L^6 - N_{21} I_{21}/L^{12} + N_{31} I_{31}/L^{18} - \cdots$$

and

$$N_{k1} = (n)_{2k}/2^k k! \sim n^{2k}/2^k k!, \qquad I_{k1} = (I_{11})^k.$$

This amounts to neglecting all graphs other than the “principal” one,
for each k, that is, the one corresponding to the configuration of Fig.
5a. Hence

$$1 - P(b) = \sum_{j=1}^{\infty} (-1)^{j-1} (n^2/2)^j (4\pi a^3/3L^3)^j / j! = 1 - e^{-b}$$

so that

$$P(b) = e^{-b}.$$ (36)


VIII. NUMERICAL EVALUATION OF THE I-INTEGRALS


Since no I integral beyond $I_{11}$ appears to be explicitly evaluable in
terms of standard functions, the possibility was investigated of
computing those integrals numerically by the Monte Carlo method. The
first set of trial calculations was performed on $I_{11}$ itself, in order to be
able to compare the results with the known true value. We assume
as before that $a \le L$ and we put L = 1 (homogeneity!) to get

$$I_{11}(a) = 4.1888a^3 - 4.7124a^4 + 1.6000a^5 - 0.1667a^6, \qquad 0 \le a \le 1.$$


We now choose a suitable integer M and set the value of a at 1/M.
Next, two points $p_1(x_1, y_1, z_1)$ and $p_2(x_2, y_2, z_2)$ are taken at random
in the unit cube by choosing each coordinate to be a random number
from the rectangular distribution on [0, 1]. Such pairs of random
points are selected N times; suppose that in $N_1$ of them the distance
between the random points does not exceed 1/M, then the quotient
$N_1/N$ is taken as the Monte Carlo approximation to $I_{11}(1/M)$. Then
the whole procedure is repeated with 1/M replaced by 2/M, 3/M, and
so on, until the value $3^{1/2}$ is passed. The whole calculation will be
referred to as an N by M Monte Carlo run.
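The procedure just described can be sketched in a few lines. The code below is an illustration under stated assumptions, not the program used for the paper: it uses Python's standard pseudo-random generator, draws a fresh set of N pairs for every value of a, and all function names are hypothetical. The closed form (34) with L = 1 is included for comparison on the range 0 ≤ a ≤ 1.

```python
import math
import random


def i11_exact(a):
    """Closed form (34) with L = 1; valid for 0 <= a <= 1."""
    return (4 * math.pi / 3) * a**3 - (3 * math.pi / 2) * a**4 + 1.6 * a**5 - a**6 / 6


def i11_monte_carlo(a, n_pairs, rng):
    """Fraction of n_pairs random point pairs in the unit cube within distance a."""
    hits = 0
    for _ in range(n_pairs):
        p1 = (rng.random(), rng.random(), rng.random())
        p2 = (rng.random(), rng.random(), rng.random())
        if math.dist(p1, p2) <= a:
            hits += 1
    return hits / n_pairs


def monte_carlo_run(n_pairs, m, seed=0):
    """An 'N by M run': estimates at a = 1/M, 2/M, ... until sqrt(3) is passed."""
    rng = random.Random(seed)
    results = []
    j = 1
    while True:
        a = j / m
        results.append((a, i11_monte_carlo(a, n_pairs, rng)))
        if a > math.sqrt(3):
            break
        j += 1
    return results


if __name__ == "__main__":
    for a, estimate in monte_carlo_run(n_pairs=1000, m=20, seed=1):
        exact = i11_exact(a) if a <= 1 else float("nan")
        print(f"a = {a:.2f}  estimate = {estimate:.4f}  exact = {exact:.4f}")
```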

In the first set of trial computations N by M Monte Carlo runs
were executed for various values of N and M, and in each case a
least-squares fit was done on these data by a polynomial of the form

$$\sum_{j=3}^{6} A_j a^j.$$

The results are shown in Table 1.

TABLE I — First Trial Computations

                          A_3       A_4       A_5       A_6
True value of A_j        4.1888   −4.7124    1.6000   −0.1667
1st 1000 by 20 run       3.1873   −1.0742   −2.5488    1.3448
2nd 1000 by 20 run       3.3296   −1.6727   −1.9088    1.1584
10000 by 20 run          4.4765   −5.9918    3.4012   −0.9760
1000 by 200 run          4.3008   −5.2689    2.4437   −0.5641
10000 by 200 run         4.1974   −4.7337    1.6358   −0.1911
100000 by 20 run         4.1546   −4.5615    1.4043   −0.0879


It appears from this polynomial that very long and large runs are
necessary to determine the coefficients with fair accuracy. However,
the values of the integral itself can be computed quite well. To check
this we have computed the standard deviations, both for the Monte
Carlo data, from

$$\sigma_1^2 = \frac{1}{M} \sum_{j=1}^{M} [(N_j/N) - I_{11}(j/M)]^2$$

and for the least squares fit from

$$\sigma_2^2 = \frac{1}{M} \sum_{j=1}^{M} [\tilde{I}_{11}(j/M) - I_{11}(j/M)]^2$$

where

$$\tilde{I}_{11}(a) = \sum_{j=3}^{6} A_j a^j$$

is the least-squares fit to $I_{11}$. The results are shown in Table 2.

As a compromise between accuracy and length of the Monte Carlo
run, the values N = 10000 and M = 20 were selected. In this way
there were computed the two integrals $I_{31}$ and $I_{32}$ corresponding to
the two (B, r, 3)-graphs, the six integrals $I_{41}, \ldots, I_{46}$ corresponding
to the six (B, r, 4)-graphs, and the 21 integrals $I_{5,1}, \ldots, I_{5,21}$
corresponding to the 21 (B, r, 5)-graphs. The first two series are shown in Figs.
6 and 7. The programming was quite simple and no details need be
given. The total time taken up on the CDC 6600 computer was about
one hour; this, however, includes a lot of trial runs and tests.


TABLE II — Standard Deviations

      N      Monte Carlo σ_1    Least squares σ_2
    1000        0.01154            0.00753
   10000        0.00257            0.00183
  100000        0.000922           0.000554


Fig. 6 — Cluster integrals for v = 3.

Fig. 7 — Cluster integrals for v = 4.

To sum up, it appears that numerical computation of I-type
integrals is quite feasible, with the help of an automatic computer, to
fairly good approximation. One well known advantage of the Monte
Carlo method of evaluating multiple integrals was clearly brought
out; namely, its relative independence of the dimension.
Data Transmission with
FSK Permutation Modulation 


By H. L. SCHNEIDER 
(Manuscript received November 8, 1967) 


Performance characteristics are derived for an FSK data transmission 
scheme in which M frequencies out of N are used simultaneously. Non- 
coherent matched filters are applied at the N frequencies, and the filter 
outputs are compared as in a permutation modulation system. 

It is shown that many permutation alphabets provide energy per bit
advantage over binary FSK, although the best results are obtained with
one-out-of-N alphabets. Considering bits per unit bandwidth, many
permutation alphabets perform as well as or better than binary; however,
one-out-of-N alphabets carry less information per unit bandwidth when N > 4.


I. INTRODUCTION 


The technique of N-ary frequency modulation in which energy is
transmitted on 1 out of N frequencies to convey $\log_2 N$ bits of
information per character has been known for some years.¹,² David Slepian³
has recently described a general modulation system, permutation
modulation, which is applied here to a multifrequency modulation
scheme in which energy is transmitted simultaneously on M frequencies
out of N, thus conveying $\log_2 \binom{N}{M}$ bits of information per character.
Binary and one out of N FSK modulation are special cases of FSK
permutation modulation.

Such a transmission scheme is basically not new; it has been used 
for many years for transmitting decimal digits, address, and other 
supervisory information in the telephone plant. This work was motivated 
by a requirement to compare the information transmission capability 
of these alphabets. However, the application analyzed here is, in fact, 
different because we assume a baud synchronous matched filter receiver 
with a mutually orthogonal set of signals. The channel is assumed to 
be nonfading, frequency flat, with white gaussian additive noise. 
II. GENERAL DESCRIPTION

For convenience, we shall refer to this modulation scheme as PFSK. 


The PFSK alphabet has $\binom{N}{M}$ characters,

$$\binom{N}{M} = \frac{N!}{M!\,(N-M)!}.$$

The $\binom{2}{1}$ alphabet is the binary FSK modulation with which other
$\binom{N}{M}$ alphabets will be compared. The $\binom{N}{1}$ alphabet is commonly
referred to as N-ary or MFSK (multiple frequency shift keyed).

PFSK transmission operates in a manner shown for the $\binom{N}{2}$ alphabet
in Figure 1. One of the $\binom{N}{M}$ characters is input to the transmitter; the
signal out is M simultaneous pulses of energy, one pulse on each of M
distinct frequencies, lasting for T seconds. White gaussian noise is
added in the channel. Filters, matched to the pulse shape, are tuned to
each of the N possible frequencies. The filter outputs are
envelope-detected and all N envelope samples are intercompared at the end of
the pulse period. The largest M of these outputs determine the
transmitted character.*


Fig. 1 — Transmission system for $\binom{N}{2}$ alphabets. (Transmitter: N
oscillators forming $\binom{N}{2}$ pairs of tones. Receiver: filters 1 through N,
each followed by an envelope detector, and a decision circuit that
chooses the 2 largest detector outputs.)


*Slepian has shown that this technique of amplitude comparisons minimizes 
the error probability. See Ref. 3.


III. ANALYSIS


An error is made in this process when any of the noise samples exceeds
any of the signal plus noise samples. The error probability is $P_e$ and
is one minus the probability of making a correct decision:

$$P_e = 1 - \int_0^\infty p_M(s)\, P_{N-M}(s)\, ds$$ (1)

where

$p_M(s)$ is the p.d.f. of the smallest signal plus noise sample
$P_{N-M}(s)$ is the distribution function of the largest noise sample.


The p.d.f. of the smallest signal plus noise sample is determined as
follows. The p.d.f. of the output sample of a matched filter detector
can be written as*

$$p(y) = y\, I_0(y\sqrt{2R}) \exp\Bigl(-\frac{y^2 + 2R}{2}\Bigr)$$ (2)

where

y is the output envelope sample amplitude normalized to the
rms noise
$I_0(\;)$ is the modified Bessel function
R is $\mathcal{E}_M/\mathcal{N}_0$
$\mathcal{E}_M$ is the received signal energy in joules at each of the M
transmitted frequencies
$\mathcal{N}_0$ is the noise density, in watts per Hz.

The probability of the smallest of M samples exceeding a value s is 
the same as the probability that all M samples exceed the value s. 
This probability is expressed by equation (3), with independence of 
the M samples following from orthogonality. 


$$1 - P_M(s) = \Bigl[\int_s^\infty p(y)\, dy\Bigr]^M = Q^M(\sqrt{2R}, s)$$ (3)


where


$P_M(s)$ is the distribution function of the smallest signal plus noise
sample
Q(-, -) is the Q function and is tabulated by Marcum’. 


* We view equation 2 as a renormalization of an expression by Helstrom* for 
matched filter detection, although it was originally derived by Rice5 in a 
different context. 
Thus, the p.d.f. of the smallest output is simply

$$p_M(s) = \frac{d}{ds} P_M(s) = M\, Q^{M-1}(\sqrt{2R}, s)\, p(s).$$ (4)

Similarly, with independent noise samples, we find the distribution
function of the largest noise sample.

$$P_{N-M}(s) = \Bigl[\int_0^s y \exp(-y^2/2)\, dy\Bigr]^{N-M} = \sum_{r=0}^{N-M} (-1)^r \binom{N-M}{r} \exp(-r s^2/2).$$ (5)


Substitution of equations (4) and (5) into equation (1) yields (after
some labor) the character error probability:

$$P_e(M) = M \sum_{r=1}^{N-M} (-1)^{r+1} \binom{N-M}{r} \int_0^\infty Q^{M-1}(\sqrt{2R}, s)\; s\, I_0(s\sqrt{2R}) \exp\Bigl[-\frac{(r+1)s^2}{2} - R\Bigr] ds.$$ (6)


A closed form expression for the case M = 1 was found by Reiger.” 


$$P_e(M = 1) = \frac{1}{N} \sum_{r=2}^{N} (-1)^r \binom{N}{r} \exp\Bigl[-R\Bigl(1 - \frac{1}{r}\Bigr)\Bigr].$$ (7)

A closed form expression for the case M = 2 is obtained from equation 


(6) using integration forms, having Q function integrands, given by 
Stein :? 


Pat = 2) = 525 Di-w(" >) 





x +r.) - a@,ale| Rr -4)]  @ 


where 





2Rr \? a 
a= (PR) 8-8 


Closed form expressions for cases of M > 2 are not known; how- 
ever, an asymptotic form for large R is obtained following arguments 
by Helstrom® for approximating the Q function: 
$$P_e(M) \approx \frac{M}{N-M+1} \sum_{r=2}^{N-M+1} (-1)^r \binom{N-M+1}{r} \exp\Bigl[-R\Bigl(1 - \frac{1}{r}\Bigr)\Bigr]$$ (9)

or taking the predominant first term of equation (9) we have*

$$P_e(M) \approx \frac{M(N-M)}{2} \exp\Bigl(-\frac{R}{2}\Bigr).$$ (10)


Equation (10) can also be obtained heuristically. At high signal- 
to-noise ratios, character errors occur because of a binary decision 
error; that is, one of the noise samples is mistaken for one of the 
signal plus noise samples. The probability of a binary decision error is 


$$P_b = \tfrac{1}{2} \exp(-R/2).$$


In the multifrequency situation, there are M(N—M) ways for this to 
happen; the product of these two factors yields equation (10). 
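For the one-out-of-N case the heuristic can be compared with the exact series. The sketch below is illustrative only; the function names are hypothetical, and the exact series is written in the standard form for noncoherent detection of N orthogonal signals, which is algebraically the same as the closed form for M = 1.

```python
from math import comb, exp


def pe_one_of_n(n, r_ratio):
    """Exact character error probability for M = 1 (one tone out of n).

    r_ratio is R, the ratio of received energy per tone to noise density.
    """
    return sum((-1) ** (r + 1) * comb(n - 1, r) / (r + 1) * exp(-r_ratio * r / (r + 1))
               for r in range(1, n))


def pe_first_term(n, m, r_ratio):
    """High signal-to-noise approximation: M(N - M) binary decision errors."""
    return m * (n - m) / 2 * exp(-r_ratio / 2)


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        exact = pe_one_of_n(n, 20.0)
        approx = pe_first_term(n, 1, 20.0)
        print(n, exact, approx, exact / approx)
```

For N = 2 the two expressions coincide; for larger N the first-term value lies above the exact one and the ratio approaches 1 as R grows.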


IV. COMPARISON OF ALPHABETS


We interrelate the performances of the PFSK alphabets to those
of binary FSK using two criteria: energy per bit required for an
equivalent error rate, and bits per unit bandwidth.† First, the per
character information of these alphabets is defined as k:

$$k = \log_2 \binom{N}{M}.$$

The normalized energy per bit $\mathcal{E}/\mathcal{N}_0$ is related to the ratio R, defined
in equation (2), by

$$\mathcal{E}/\mathcal{N}_0 = (M/k)\, R.$$ (11)


Since the quantity R appears in the exponent of the error rate 
expression, it is apparent that, for low error rates, the power advan- 
tage (over binary FSK) of a PFSK alphabet approaches k/M. We 
can observe this numerically by comparing error rates on the basis 


*It is easy to show that the first term is always an upper bound to Pe. 

† The reader can compare the results of the work here with recent work of
I. Jacobs, who intercompares coherent modulation systems using virtually the 
same criteria. 
of “equivalent error probability,” which is the binary error
probability for which the probability of one or more errors in a binary
sequence of k bits is equal to the probability of error in the PFSK
case. This equivalent error probability is defined as $P_{eq}$:

$$P_{eq} = 1 - [1 - P_e(M)]^{1/k} \approx \frac{1}{k} P_e(M).$$ (12)


Figure 2 illustrates $P_{eq}$ as a function of $\mathcal{E}/\mathcal{N}_0$ for several alphabets.
At error rates of 107°, the power advantage is within 0.6 dB of the k/M 


value for the @ 7. and € ,) alphabets, and closer for the other examples. 


Fig. 2 — PFSK error probabilities. (Ordinate: equivalent error
probability; abscissa: $\mathcal{E}/\mathcal{N}_0$ in decibels.)


The number of bits per unit bandwidth for a PFSK alphabet is 
determined by estimating the bandwidth as N times the frequency 
separation, which is 1/T for noncoherent orthogonal signals with mini- 
mum frequency spacing. Since the information rate is k/T, the desired 
bits per cycle ratio is simply k/N.* Figure 3 shows paired values of
$10 \log_{10}(k/M)$ and k/N for illustrative alphabets.
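The two figures of merit used in this comparison are straightforward to compute for any alphabet. The sketch below is an illustration with a hypothetical function name; the alphabets listed in the usage loop are arbitrary examples, not necessarily those plotted in Fig. 3.

```python
from math import comb, log2, log10


def pfsk_figures(n, m):
    """Return (k, power advantage in dB, bits per cycle) for an m-out-of-n alphabet.

    k is the information per character in bits; the asymptotic power
    advantage over binary FSK is 10 log10(k/m); bits per cycle is k/n.
    """
    k = log2(comb(n, m))
    return k, 10 * log10(k / m), k / n


if __name__ == "__main__":
    for n, m in [(2, 1), (4, 1), (8, 1), (16, 1), (4, 2), (6, 2), (6, 3)]:
        k, gain_db, bits_per_cycle = pfsk_figures(n, m)
        print(f"({n},{m}): k = {k:.3f}, gain = {gain_db:.2f} dB, k/N = {bits_per_cycle:.3f}")
```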


* It is easy to show that this ratio for PFSK alphabets approaches a maximum
value of 1 for large N, with M = N/2. At this point k/M = 2 for a 3 dB
advantage.
Fig. 3 — Performance comparison of PFSK alphabets.


Symbol                               A  B  C  D   E   F  G  H  I  J   K   L
N (upper index of alphabet symbol)   2  4  8  16  32  4  5  6  8  12  16  6

V. SUMMARY AND CONCLUSIONS 


It has been shown that the PFSK technique gives significant power 
advantage over binary FSK. In addition, bandwidth can be controlled 
by the proper choice of alphabet. 

Disadvantages of the technique are practical ones. Implementation 
of the decision function is relatively complicated. In some applications 
peak power limitations might make the average power calculations 
inapplicable. 

A generally large number of characters in the alphabet is not suited
to all applications, but can be very efficient in some. For example, the
$\binom{5}{2}$ alphabet, containing 10 characters, is well suited to decimal digits.